Box PCT

U.S. Patent and Trademark Office 
P.O. Box 2327 
Arlington, VA 22202 

PRELIMINARY AMENDMENT 

Dear Sir: 

Before calculation of the filing fee, please amend the 
claims of the above-referenced patent application, as follows: 



In the Claims:



Please amend claims 1, 8-15, 17-23, 26, 29, 33, 35, 37, 
38 and 42 as follows: 



1. (Amended) A DNA sequence which is selected from the group
consisting of (a) at least part of the sequence set out in
the appended sequence listing; and (b) a variant of a
sequence (a) which encodes a polypeptide which is at least
80%, identical with the corresponding peptide as set out in
table II; provided that it is not a sequence encoding all
or part of the polypeptide consisting of amino acids 1-920
encoded by mon AI as set out in table II.

8. (Amended) A DNA sequence according to claim 1 encoding any 
one or more of the domains as set out in Table I or a 
variant or part thereof. 

9. (Amended) A DNA sequence according to claim 1 which has a
length of at least 30 bases. 

10. (Amended) A recombinant cloning or expression vector 
comprising a DNA sequence according to claim 1. 

11. (Amended) A transformant host cell which has been
transformed to contain a DNA sequence according to claim 1
and which is capable of expressing a corresponding
polypeptide.

12. (Amended) A hybridisation probe which is a DNA sequence
according to claim 1.

13. (Amended) A method of detecting a PKS cluster comprising 
using a probe according to claim 12 to detect a PKS 
cluster, optionally followed by isolation of the detected 
cluster. 

14. (Amended) A method of detecting genes comprising using a 
probe according to claim 12 which encodes at least part of 
a polypeptide having a known function to detect genes 
encoding polypeptides having analogous function. 

15. (Amended) A method according to claim 14 wherein the 
polypeptide of known function is AT of module 5 or the 
regulatory protein encoded by mon RI.

17. (Amended) A method of detecting the presence of a gene 
cluster which governs the synthesis of a polyether, which 
comprises using a probe according to claim 16, and 
optionally isolating a gene cluster detected thereby. 

18. (Amended) A method of detecting a gene comprising using a
probe according to claim 12 which comprises a polynucleotide
which binds specifically to a gene responsible for levels
of activity of the monensin gene cluster, for detecting an
analogous gene in a gene cluster for biosynthesis of
another polyketide, optionally followed by a step of
manipulating the gene detected thereby to alter the level
of expression of said other polyketide.

19. (Amended) A method according to claim 18 wherein the gene 
is a regulatory gene, resistance gene or thioesterase gene. 

20. (Amended) A method of expressing a heterologous gene in S. 
cinnamonensis comprising inserting said gene so that it is
expressed under the control of the mon RI gene or variant 
and a monensin promoter. 

21. (Amended) A method of expressing a polyketide other than 
monensin which includes using a portion of the monensin 
gene cluster encoding a polypeptide having chain 
terminating activity, comprising at least one of mon AIX
and mon AX or a mutant, allele or other variant thereof
encoding a polypeptide having chain terminating activity, 
to effect chain release of said polyketide other than 
monensin. 

22. (Amended) A method of synthesising a polyketide other than
monensin which includes using a portion of the monensin
gene cluster encoding a polypeptide having carbon-carbon
double bond isomerase activity comprising at least one of
mon BI and mon BII or a mutant, allele or other variant
thereof having isomerase activity to provide a desired
stereochemical outcome in the synthesis of said polyketide
other than monensin.

23. (Amended) A polypeptide encoded by a portion of the 
monensin gene cluster, comprising at least one portion 
selected from mon BI and mon BII or a mutant, allele or
other variant thereof, having carbon-carbon double bond 
isomerase activity, or at least one of mon AIX and mon AX 
or a mutant, allele or other variant thereof having chain 
terminating activity. 

26. (Amended) A method for the biosynthesis of a polyketide 
other than monensin which comprises using a portion of the 
monensin gene cluster encoding a peptide having epoxidase 
or cyclase activity, to provide a said activity in the 
biosynthesis of said polyketide other than monensin. 

29. (Amended) A process according to claim 27 wherein the
starter unit also includes an ATq domain derived from an AT
domain which is naturally associated with the KS domain.

33. (Amended) A DNA sequence according to claim 30 wherein said
loading module is adapted to load a starter unit other than
a starter unit normally received by the adjacent extension
module.

35. (Amended) A polyketide synthase encoded by the DNA sequence 
of claim 30. 

37. (Amended) A vector containing a DNA sequence of claim 30. 

38. (Amended) A transformant cell transformed to contain a DNA 
sequence of claim 30. 

42. (Amended) A method of producing monensin comprising
culturing the organism of claim 41. 

Please add new claims 46 and 47 as follows: 

46. (New) A DNA sequence according to claim 1 which is a
variant of a sequence (a) which encodes a peptide which is 
at least 90% identical with the corresponding peptide as 
set out in table II. 

47. (New) A DNA sequence according to claim 1 which has a 
length of at least 60 bases.

REMARKS

The purpose of this Preliminary Amendment is to 
eliminate multiple claim dependencies, revise claims which, due
to their form, do not comply with current U.S. Patent and 
Trademark Office practice, and to present additional claims 
directed to preferred embodiments of the invention. 

The foregoing amendments do not introduce new matter 
into the present application, and, therefore, should be entered 
without objection. 

Early and favorable consideration of the present 
application is respectfully requested. 

Respectfully submitted, 

Patrick J. Hagan 
Reg. No. 27,643 
Attorney for Applicant 

PJH:ksk

MARKED-UP COPY OF THE CLAIMS

1. (Amended) A DNA sequence which is selected from the group 
consisting of (a) at least part of the sequence set out in 
the appended sequence listing; [or] and (b) a variant of a 
sequence (a) which encodes a polypeptide which is at least 
80%, [preferably at least 90%], identical with the 
corresponding peptide as set out in table II; provided that 
it is not a sequence encoding all or part of the 
polypeptide consisting of amino acids 1-920 encoded by mon 
AI as set out in table II. 

8. (Amended) A DNA sequence according to [any preceding] claim
1 encoding any one or more of the domains as set out in 
Table I or a variant or part thereof. 

9. (Amended) A DNA sequence according to [any preceding] claim
1 which has a length of at least 30 [, preferably at least
60,] bases.

10. (Amended) A recombinant cloning or expression vector 
comprising a DNA sequence according to [any preceding] 
claim 1.

11. (Amended) A transformant host cell which has been 
transformed to contain a DNA sequence according to [any of 
claims 1-9] claim 1 and which is capable of expressing a 
corresponding polypeptide. 



12. (Amended) A hybridisation probe which is a DNA sequence
according to [any of claims 1-9] claim 1.

13. (Amended) A method of detecting a PKS cluster comprising
using [Use of] a probe according to claim 12 to detect a
PKS cluster, optionally followed by isolation of the
detected cluster.

14. (Amended) A method of detecting genes comprising using [Use
of] a probe according to claim 12 which encodes at least
part of a polypeptide having a known function to detect
genes encoding polypeptides having analogous function.

15. (Amended) A method [Use] according to claim 14 wherein the
polypeptide of known function is AT of module 5 or the
regulatory protein encoded by mon RI.

17. (Amended) [Use of a probe according to claim 16 in a] A
method of detecting the presence of a gene cluster which
governs the synthesis of a polyether, which comprises using
a probe according to claim 16, and optionally isolating a
gene cluster detected thereby.

18. (Amended) [Use of] A method of detecting a gene comprising
using a probe according to claim 12 which comprises a
polynucleotide which binds specifically to a gene
responsible for levels of activity of the monensin gene
cluster, [in a method of] for detecting an analogous gene
in a gene cluster for biosynthesis of another polyketide,
optionally followed by a step of manipulating the gene
detected thereby to alter the level of expression of said
other polyketide.

19. (Amended) A method [Use] according to claim 18 wherein the
gene is a regulatory gene, resistance gene or thioesterase
gene.

20. (Amended) A method of expressing a heterologous gene in S.
cinnamonensis comprising inserting said gene so that it is
expressed under the control [Use] of the mon RI gene or
variant and a monensin promoter [to control expression of
a heterologous gene in S. cinnamonensis].

21. (Amended) A method of expressing a polyketide other than 
monensin which includes using [Use of] a portion of the 
monensin gene cluster encoding a polypeptide having chain 
terminating activity, [preferably] comprising at least one 
of mon AIX and mon AX or a mutant, allele or other variant 
thereof encoding a polypeptide having chain terminating 
activity, to effect chain release of [a peptide] said 
polyketide other than monensin.

22. (Amended) A method of synthesising a polyketide other than
monensin which includes using [Use of] a portion of the
monensin gene cluster encoding a polypeptide having
carbon-carbon double bond isomerase activity [, preferably]
comprising at least one of mon BI and mon BII or a mutant,
allele or other variant thereof having isomerase activity
to provide a desired stereochemical outcome in the
synthesis of [a] said polyketide other than monensin.

23. (Amended) A polypeptide encoded by a portion of the 
monensin gene cluster, [preferably] comprising at least one 
[of] portion selected from mon BI and mon BII or a mutant, 
allele or other variant thereof, having carbon-carbon 
double bond isomerase activity, or at least one of mon AIX 
and mon AX or a mutant, allele or other variant thereof 
having chain terminating activity. 

26. (Amended) A method for the biosynthesis of a polyketide 
other than monensin which comprises using [Use of] a 
portion of the monensin gene cluster encoding a peptide 
having epoxidase or cyclase activity, [preferably 
comprising mon CI or mon CII or a mutant, allele or other 
variant thereof encoding a polypeptide having epoxidase or 
cyclase activity] to provide a said activity in the 
biosynthesis of [a polypeptide] said polyketide other than
monensin.

29. (Amended) A process according to claim 27 [or claim 28]
wherein the starter unit also includes an ATq domain derived
from an AT domain which is naturally associated with the KS
domain.

33. (Amended) A DNA sequence according to claim 30 [, 31 or 32]
wherein said loading module is adapted to load a starter 
unit other than a starter unit normally received by the 
adjacent extension module. 

35. (Amended) A polyketide synthase encoded by the DNA sequence
of [any of claims 30-34] claim 30.

37. (Amended) A vector containing a DNA sequence of [any of
claims 30-34] claim 30.

38. (Amended) A transformant cell transformed to contain a DNA
sequence of [any of claims 30-34] claim 30.

42. (Amended) A method of producing monensin comprising 
culturing the organism of claim 41 [and/or an organism 
produced by the method of claim 39 or claim 40].

POLYKETIDES AND THEIR SYNTHESIS



The present invention relates to processes and
materials (including enzyme systems, nucleic acids,
vectors and cultures) for preparing polyketides,
particularly polyethers but including polyenes,
macrolides and other polyketides by recombinant
synthesis, and to the polyketides so produced,
particularly novel polyketides. (N.B. the term
"polyketide" is being used in its conventional sense to
include structures notionally derived by the reduction
and/or other processing or modification of one or more
ketide units.) Furthermore the invention provides the
entire nucleic acid sequence of the biosynthetic gene
cluster that governs the production of the ionophoric
antibiotic polyether polyketide monensin in Streptomyces
cinnamonensis, and the use of all or part of the cloned
DNA first, in the specific detection of other polyether
biosynthetic gene clusters; secondly in the engineering
of mutant strains of S. cinnamonensis and of other
actinomycetes which are suitable host strains for the
high level production of novel recombinant polyketides;
and thirdly in the provision of recombinant biosynthetic
genes which lead to such novel polyketide products.

Polyketides are a large and structurally diverse
class of natural products that includes many compounds
possessing antibiotic or other pharmacological
properties, such as erythromycin, tetracyclines,
rapamycin, avermectin, monensin, epothilones and FK506.
In particular, polyketides are abundantly produced by
Streptomyces and related actinomycete bacteria. They are
synthesised by the repeated stepwise condensation of
acylthioesters in a manner analogous to that of fatty
acid biosynthesis. The greater structural diversity found
among natural polyketides arises from the selection of
(usually) acetate or propionate as "starter" or
"extender" units; and from the differing degree of
processing of the β-keto group observed after each
condensation. Examples of processing steps include
reduction to β-hydroxyacyl-, reduction followed by
dehydration to 2-enoyl-, and complete reduction to the
saturated acylthioester. The stereochemical outcome of
these processing steps is also specified for each cycle
of chain extension. In addition, the biosynthetic
pathways to many polyketides involve additional
enzyme-catalysed modifications which may include:
methylation by O- and C-methyltransferases, hydroxylation
by cytochrome P450 enzymes, other oxidation or reduction
processes, and the biosynthesis and attachment of novel
sugars and/or deoxy sugars.
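
By way of illustration only, the graded processing of the β-keto
group described above may be sketched as follows. The domain
labels KR (ketoreductase), DH (dehydratase) and ER
(enoylreductase), the function name, and the simplifying
assumption that the three activities act strictly in that order
are introduced here for the example and are not definitions
taken from this specification.

```python
from enum import Enum


class BetaCarbonState(Enum):
    """Processing level of the beta-carbon after one condensation."""
    KETO = "beta-ketoacyl (no processing)"
    HYDROXY = "beta-hydroxyacyl (reduction)"
    ENOYL = "2-enoyl (reduction, then dehydration)"
    SATURATED = "saturated acylthioester (complete reduction)"


def beta_carbon_state(reductive_domains):
    """Return the processing level implied by a set of domain labels.

    Assumption (hypothetical model): KR acts first, DH needs the KR
    product, and ER needs the DH product, so a later activity without
    the earlier one has no effect.
    """
    present = {label.upper() for label in reductive_domains}
    if "KR" not in present:
        return BetaCarbonState.KETO
    if "DH" not in present:
        return BetaCarbonState.HYDROXY
    if "ER" not in present:
        return BetaCarbonState.ENOYL
    return BetaCarbonState.SATURATED


if __name__ == "__main__":
    for domains in ([], ["KR"], ["KR", "DH"], ["KR", "DH", "ER"]):
        print(domains, "->", beta_carbon_state(domains).value)
```

The sketch ignores stereochemistry, which the text notes is also
specified for each cycle of chain extension.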

The biosynthesis of polyketides is initiated by a
group of chain-forming enzymes known as polyketide
synthases. Two classes of polyketide synthase (PKS) have
been described in actinomycetes. One class, named Type I
PKSs, represented by the PKSs for the macrolides
erythromycin, oleandomycin, avermectin and rapamycin,
consists of a different set or "module" of enzymes for
each cycle of polyketide chain extension. (For examples
see Cortes, J. et al. Nature (1990) 348:176-178; Donadio,
S. et al. Science (1991) 252:675-679; Swan, D.G. et al.
Mol. Gen. Genet. (1994) 242:358-362; MacNeil, D.J. et al.
Gene (1992) 115:119-125; Schwecke, T. et al. Proc. Natl.
Acad. Sci. USA (1995) 92:7839-7843.)

The term "extension module" as used herein refers to
the set of contiguous domains, from a β-ketoacyl-ACP
synthase ("KS") domain to the next acyl carrier protein
("ACP") domain, which accomplishes one cycle of
polyketide chain extension. The term "loading module" is
used to refer to any group of contiguous domains which
accomplishes the loading of the starter unit onto the PKS
and thus renders it available to the KS domain of the
first extension module. The length of polyketide formed
has been altered, in the case of erythromycin
biosynthesis, by specific relocation using genetic
engineering of the enzymatic domain of the
erythromycin-producing PKS that contains the chain
releasing thioesterase/cyclase activity (Cortes J. et al.
Science (1995) 268:1487-1489; Kao, C.M. et al. J. Am.
Chem. Soc. (1995) 117:9105-9106).

In-frame deletion of the DNA encoding part of the
ketoreductase domain in module 5 of the
erythromycin-producing PKS (also known as
6-deoxyerythronolide B synthase, DEBS) has been shown to
lead to the formation of erythromycin analogues
5,6-dideoxy-3-α-mycarosyl-5-oxoerythronolide B,
5,6-dideoxy-5-oxoerythronolide B and
5,6-dideoxy,6-β-epoxy-5-oxoerythronolide B (Donadio, S.
et al. Science (1991) 252:675-679). Likewise, alteration
of active site residues in the enoylreductase domain of
module 4 in DEBS, by genetic engineering of the
corresponding PKS-encoding DNA and its introduction into
Saccharopolyspora erythraea, led to the production of
6,7-anhydroerythromycin C (Donadio, S. et al. Proc. Natl.
Acad. Sci. USA (1993) 90:7119-7123).

International Patent Application number WO 93/13663
describes additional types of genetic manipulation of the
DEBS genes that are capable of producing altered
polyketides. However many such attempts are reported to
have been unproductive (Hutchinson, C.R. and Fujii, I.
Annu. Rev. Microbiol. (1995) 49:201-238, at p. 231). The
complete DNA sequence of the genes from Streptomyces
hygroscopicus that encode the modular Type I PKS
governing the biosynthesis of the macrocyclic
immunosuppressant polyketide rapamycin has been disclosed
(Schwecke, T. et al. (1995) Proc. Natl. Acad. Sci. USA
92:7839-7843). The DNA sequence is deposited in the
EMBL/Genbank Database under the accession number X86780.

WO 98/01546 discloses that a PKS gene assembly
(particularly of Type I) encodes a loading module which
is followed by at least one extension module. The first
open reading frame encodes the first multi-enzyme or
cassette (DEBS1) which consists of three modules: the
loading module (ery-load) and two extension modules
(modules 1 and 2). The loading module comprises an
acyltransferase and an acyl carrier protein. This may be
contrasted with Figure 1 of WO 93/13663 (referred to
above). This shows ORF1 as only two modules, the first of
which is in fact both the loading module and the first
extension module.
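
The definitions of "loading module" and "extension module"
used herein lend themselves to a simple illustration. The
following sketch, offered only as an example, partitions an
ordered list of domain labels into a loading module and
extension modules, each extension module running from a KS
domain to the next ACP domain. The function name, the returned
"trailing" list and the example domain order for DEBS1 (in
particular the AT and KR labels within each extension module)
are assumptions introduced for the example, not data taken
from this specification.

```python
def split_modules(domains):
    """Partition ordered domain labels into PKS modules.

    Returns (loading, extension_modules, trailing):
      loading           - domains before the first "KS"
      extension_modules - lists running from a "KS" to the next "ACP"
      trailing          - domains after an ACP that start no new module
                          (for example a chain-releasing thioesterase)
    """
    domains = list(domains)
    if "KS" not in domains:
        return domains, [], []

    first_ks = domains.index("KS")
    loading = domains[:first_ks]
    extension_modules, trailing, current = [], [], []

    for label in domains[first_ks:]:
        if label == "KS":
            if current:
                raise ValueError("new KS before the previous module reached an ACP")
            current = [label]
        elif current:
            current.append(label)
            if label == "ACP":
                extension_modules.append(current)
                current = []
        else:
            trailing.append(label)

    if current:
        raise ValueError("last extension module has no ACP")
    return loading, extension_modules, trailing


if __name__ == "__main__":
    # Hypothetical domain order for DEBS1: loading module plus two extension modules.
    debs1 = ["AT", "ACP", "KS", "AT", "KR", "ACP", "KS", "AT", "KR", "ACP"]
    loading, modules, trailing = split_modules(debs1)
    print("loading module:", loading)
    for number, module in enumerate(modules, start=1):
        print("extension module", number, ":", module)
```

Under these assumptions the example yields a loading module of
an acyltransferase and an acyl carrier protein followed by two
extension modules, in line with the three-module description of
DEBS1 given above.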

WO 98/01546 describes in general terms the
production of a hybrid PKS gene assembly comprising a
loading module and at least one extension module. It also
describes (see also Marsden, A.F.A. et al. Science (1998)
279:199-202) construction of a hybrid PKS gene assembly
by grafting the wide-specificity loading module for the
avermectin-producing polyketide synthase onto the first
multi-enzyme component (DEBS1) for the erythromycin PKS
in place of the normal loading module. Certain novel
polyketides can be prepared using the hybrid PKS gene
assembly, as described for example in WO 98/01571.

WO 98/01546 further describes the construction of a
hybrid PKS gene assembly by grafting the loading module
for the rapamycin-producing polyketide synthase onto the
first multi-enzyme component (DEBS1) for the erythromycin
PKS in place of the normal loading module. The loading
module of the rapamycin PKS differs from the loading
modules of DEBS and the avermectin PKS in that it
comprises a CoA ligase domain, an enoylreductase ("ER")
domain and an ACP, so that suitable organic acids
including the natural starter unit
3,4-dihydroxycyclohexane carboxylic acid may be activated
in situ on the PKS loading domain and, with or without
reduction by the ER domain, transferred to the ACP for
intramolecular loading of the KS of extension module 1
(Schwecke, T. et al. Proc. Natl. Acad. Sci. USA (1995)
92:7839-7843). WO 98/51695 and WO 98/49315 describe
additional types of genetic manipulation of the DEBS
genes that are capable of producing altered polyketides.

The second class of PKS, named Type II PKSs, is
represented by the synthases for aromatic compounds. Type
II PKSs contain only a single set of enzymatic activities
for chain extension and these are re-used as appropriate
in successive cycles (Bibb, M.J. et al. EMBO J. (1989)
8:2727-2736; Sherman, D.H. et al. EMBO J. (1989)
8:2717-2725; Fernandez-Moreno, M.A. et al. J. Biol. Chem.
(1992) 267:19278-19290). The "extender" units for the
Type II PKSs are usually acetate units, and the presence of
specific cyclases dictates the preferred pathway for
cyclisation of the completed chain into an aromatic
product (Hutchinson, C.R. and Fujii, I. Annu. Rev.
Microbiol. (1995) 49:201-238). Hybrid polyketides have
been obtained by the introduction of cloned Type II PKS
gene-containing DNA into another strain containing a
different Type II PKS gene cluster, for example by
introduction of DNA derived from the gene cluster for
actinorhodin, a blue-pigmented polyketide from
Streptomyces coelicolor, into an anthraquinone
polyketide-producing strain of Streptomyces galilaeus
(Bartel, P.L. et al. J. Bacteriol. (1990) 172:4816-4826).

The minimal number of domains required for
polyketide chain extension on a Type II PKS when
expressed in a Streptomyces coelicolor host cell (the
"minimal PKS") has been defined for example in WO
95/08548 as containing the following three polypeptides
which are products of the actI genes: firstly KS;
secondly a polypeptide termed the CLF with end-to-end
amino acid sequence similarity to the KS but in which the
essential active site residue of the KS, namely a
cysteine residue, is substituted either by a glutamine
residue or, in the case of the PKS for a spore pigment
such as the whiE gene product (Davis, N.K. and Chater,
K.F. Mol. Microbiol. (1990) 4:1679-1691) by a glutamic
acid residue; and finally an ACP. The CLF has been stated
(for example in WO 95/08548) to be a factor that
determines the chain length of the polyketide chain that
is produced by the minimal PKS. However it has been found
(Shen, B. et al. J. Am. Chem. Soc. (1995) 117:6811-6821)
that when the CLF for the octaketide actinorhodin is used
to replace the CLF for the decaketide tetracenomycin in
host cells of Streptomyces glaucescens, the polyketide
product is not found to be altered from a decaketide to
an octaketide, so the exact role of the CLF remains
unclear. An alternative nomenclature has been proposed in
which KS is designated KSα and CLF is designated KSβ, to
reflect this lack of knowledge (Meurer, G. et al.
Chemistry & Biology (1997) 4:433-443). The mechanism by
which acetate starter units and acetate extender units
are loaded onto the Type II PKS is not known, but it is
speculated that the malonyl-CoA:ACP acyltransferase of
the fatty acid synthase of the host cell can fulfil the
same function for the Type II PKS (Revill, W.P. et al. J.
Bacteriol. (1995) 177:3946-3952).

WO 95/08548 describes the replacement of
actinorhodin PKS genes by heterologous DNA from other
Type II PKS gene clusters, to obtain hybrid polyketides.
It also describes the construction of a strain of
Streptomyces coelicolor which substantially lacks the
native gene cluster for actinorhodin, and the use in that
strain of a plasmid vector pRM5 derived from the low-copy
number vector SCP2* isolated from Streptomyces coelicolor
(Bibb, M.J. and Hopwood, D.A. J. Gen. Microbiol. (1981)
126:427-442) and in which heterologous PKS-encoding DNA
may be expressed under the control of the divergent
actI/actIII promoter region of the actinorhodin gene
cluster (Fernandez-Moreno, M.A. et al. J. Biol. Chem.
(1992) 267:19278-19290). The plasmid pRM5 also contains
DNA from the actinorhodin biosynthetic gene cluster
encoding the gene for a specific activator protein,
ActII-orf4. The ActII-orf4 protein is required for
transcription of the genes placed under the control of the
actI/actIII bidirectional promoter and activates gene
expression during the transition from growth to stationary
phase in the vegetative mycelium (Hallam, S.E. et al. Gene
(1988) 74:305-320).

Type II clusters in Streptomyces are known to be
activated by pathway-specific activator genes (Narva,
K.E. and Feitelson, J.S. J. Bacteriol. (1990)
172:326-333; Stutzman-Engwall, K.J. et al. J. Bacteriol.
(1992) 174:144-154; Fernandez-Moreno, M.A. et al. Cell
(1991) 66:769-780; Takano, E. et al. Mol. Microbiol.
(1992) 6:2797-2804; Gramajo, H.C. et al. Mol. Microbiol.
(1993) 7:837-845). The DnrI gene product complements a
mutation in the actII-orf4 gene of S. coelicolor, implying
that DnrI and ActII-orf4 proteins act on similar targets. A
gene (srmR) has been described (EP 0 524 832 A2) that is
located near the Type I PKS gene cluster for the
macrolide polyketide spiramycin. This gene specifically
activates the production of the macrolide antibiotic
spiramycin, but no other examples have been found of such
a gene. Also, no homologues of the ActII-orf4/DnrI/RedD
family of activators have been described that act on Type
I PKS genes. WO 98/01546 describes the use of the
ActII-orf4 family of activators in conjunction with their
cognate promoters (e.g. actII-orf4 with the actI promoter)
in a heterologous actinomycete to obtain high level
expression of recombinant Type I polyketide synthase
genes.

Although large numbers of therapeutically important
polyketides have been identified, there remains a need to
obtain novel polyketides that have enhanced properties or
possess completely novel bioactivity. The complex
polyketides produced by Type I PKSs are particularly
valuable, in that they include compounds with known
utility as anthelmintics, insecticides,
immunosuppressants, antifungal agents or antibacterial
agents. Because of their structural complexity, such
novel polyketides are not readily obtainable by total
chemical synthesis, nor by chemical modifications of
known polyketides.

There is also a need to develop reliable and
specific ways of deploying individual genes and portions
of genes in practice so that all, or a large fraction, of
hybrid PKS genes that are constructed, are viable and
produce the desired polyketide product. This includes the
development of advantageous host strains for expression
of such genes. For example many polyketides are rendered
bioactive by the action of further enzymes other than the
polyketide synthase, and host strains that contain and
are able to express the genes for such enzymes are
particularly convenient for the efficient synthesis of
the bioactive material. In those cases where the
construction of a known or a novel polyketide requires
specialised precursors, host strains containing and able
to express the genes for key enzymes that enhance the
production of such specialised precursors are equally
valuable and desirable. There is also a need to develop
rational methods of increasing the expression level of
all the genes required for production of a specific
polyketide. Clearly also a host cell which is
advantageous for the above reasons, and/or because of
other favourable characteristics including but not
limited to its speed of growth, excellent handling
characteristics in fermentation, and ease of
transformation with DNA by various techniques, can be
made even more favourable by the cloning into that cell
of such auxiliary genes for polyketide modification, or
gene activation, or post-translational modification, or
precursor supply.

The DNA sequences have been disclosed for several
Type I PKS gene clusters that govern the production of
16-membered macrolide polyketides, including the tylosin
PKS from Streptomyces fradiae (application EP 0 791 655
A2), the niddamycin PKS from Streptomyces caelestis
(Kakavas, S.J. et al. J. Bacteriol. (1997) 179:7515-7522)
and the spiramycin PKS from Streptomyces ambofaciens
(application EP 0 791 655 A2). DNA sequences have also
been disclosed for Type I PKS gene clusters that govern
the production of further complex polyketides, for
example rifamycin from Amycolatopsis mediterranei (WO
98/07868), and soraphen from Sorangium cellulosum (US
5716849), but so far no DNA sequence has been disclosed
for one of the most widespread and important classes of
complex polyketides, the polyethers.

Polyethers form an important group of complex
polyketide antibiotics (Westley, J.W. in "Antibiotics IV.
Biosynthesis" (Corcoran, J.W. Ed.), Springer-Verlag, New
York (1981) p. 41-73). They are polyoxygenated carboxylic
acids which act as selective ionophores transporting
cations across the cell membrane of target cells and
thereby causing depolarisation and cell death. Certain
polyethers including monensin, lasalocid and tetronasin
are in widespread use in animal husbandry as
coccidiostats (principally targeted against Eimeria
spp.) and as growth promoters. Polyethers have also been
reported to be active in vitro and in vivo against the
malarial parasite Plasmodium falciparum (Gumila, C. et
al. Antimicrobial Agents and Chemotherapy (1997)
41:523-529).

Polyethers contain multiple asymmetric centres and
are characterised by the presence of tetrahydrofuran and
tetrahydropyran rings, producing a characteristic shape
which is non-polar on its outer surface and therefore
well adapted for transport of material across bacterial
membranes; and provides on its inner surface polar
coordinating ligands for a centrally-bound metal ion. In
addition to tetrahydrofuran and tetrahydropyran rings,
other groups which are often present include spiroketal,
dispiroketal, and substituted benzoic acid moieties and
occasionally other groups for example a tetronic acid or
a 6-membered carbocyclic ring.

Monensins A and B are produced by the actinomycete
Streptomyces cinnamonensis. Their structures are shown in
Figure 1. Monensin B differs from monensin A only in the
presence of a methyl sidechain at C-16 rather than an
ethyl sidechain. Monensin selectively binds and
transports sodium ions. In addition to its antibacterial
and antifungal properties monensin has some activity
against protozoal parasites such as the malarial parasite
Plasmodium falciparum. Although the structures of
polyethers differ significantly from those of other
complex polyketides such as the polyhydroxylated and
polyene macrolides, their biosynthesis appears to take
place by a metabolic pathway which has many common
elements. Thus experiments using carbon-14-labelled
precursors have shown that monensin A is synthesised from
five acetate, one butyrate and seven propionate units
(Day, L.E. et al. Antimicrob. Agents Chemother. (1973)
4:410-414). Similarly experiments using precursors
doubly-labelled with carbon-13 and oxygen-18 have shown
that oxygens (O)1, (O)3, (O)4, (O)5, (O)6 and (O)10 of
monensin arise from the carboxylate oxygens of either
propionate or acetate, while growth in the presence of
oxygen-18 oxygen gas demonstrated that the three
remaining ether oxygens (O)7, (O)8 and (O)9 are derived
from molecular oxygen (Cane, D.E. et al. J. Am. Chem.
Soc. (1981) 103:5962-5965; Cane, D.E. et al. J. Am. Chem.
Soc. (1982) 104:7274-7281; Ajaz, A.A. and Robinson,
J.A. J. Chem. Soc. Chem. Commun. (1983) 12:679-680).
These findings have been rationalised by proposing that
the biosynthesis of monensin proceeds via an acyclic
triene intermediate (1) in which the geometry of all
three carbon-carbon double bonds is E (entgegen) rather
than Z (zusammen). The triene is then proposed to be
subject to epoxidation to a tri-epoxide (2) and then ring
opening is proposed to occur with concomitant sequential
formation of the five ether rings as shown in Figure 2A.
Such a biosynthetic pathway, first mooted by Westley in
1974 (Westley J.W. et al. J. Antibiot. (1974)
27:597-604) accounts for the observed stereochemistry at
the multiple asymmetric centres in monensin (Cane, D.E. et
al. J. Am. Chem. Soc. (1982) 104:7274-7281; Sood, G.R. et
al. J. Chem. Soc. Chem. Commun. (1984) 21:1421-1424) and
analogous schemes can be used to account for the
biosynthesis of other known polyethers such as lasalocid
A (Hutchinson C.R. et al. J. Am. Chem. Soc. (1981)
103:5953-5956), tetronasin (ICI 139603) (Demetriadou,
A.K. et al. J. Chem. Soc. Chem. Commun. (1985) 7:408-410)
and narasin (Spavold, Z. et al. Tetrahedron Letters
(1986) 27:3299-3302). The hydroxylation at C-26 and the
introduction of an O-methyl group on oxygen 3 are
proposed to occur as late steps in the biosynthesis,
after formation of the polyether structure.

Unfortunately key aspects of the biosynthetic scheme
shown in Figure 2A have so far eluded experimental
confirmation. No biosynthetic intermediates have been
isolated from mutants of S. cinnamonensis that are
blocked in early stages of monensin production.
26-Deoxymonensin A has been isolated from a S. cinnamonensis
mutant partially blocked in monensin production
(Ashworth, D.M. et al. J. Antibiot. (1989) 42:1088-1099)
and 3-O-demethylmonensins A and B have been recovered as
minor components from the fermentation broth of a
monensin-producing strain (Pospisil, S. et al. J.
Antibiot. (1987) 40:555-557). When fed to cells of S.
cinnamonensis in radio-labelled form, neither
26-deoxymonensin A, nor 3-O-demethylmonensin A, nor
3-O-demethyl, 26-deoxymonensin A were significantly
incorporated into monensin A (Ashworth, D.M. et al. J.
Antibiot. (1989) 42:1088-1099), either because they are
actively excluded or because these modifications in fact
occur earlier in the biosynthetic pathway so that these
metabolites are shunt products not readily converted into
the final antibiotic by the respective hydroxylase or
methyltransferase. Similarly, the putative all-(E)-triene
precursor (1) has been synthesised and shown not to
become incorporated into monensin when fed to growing
cells of S. cinnamonensis (Holmes, D.S. et al. Helv.
Chim. Acta (1990) 73:239-259). An alternative pathway has
been proposed, as shown in Fig 2B, based on the
transition-metal-mediated oxidation of 1,5-dienes (Walba,
D.M. and Edwards, P.D. Tetrahedron Lett. (1980)
21:3531-3534). The triene intermediate (4) would differ from
that of Figure 2A (1) only in that each carbon-carbon
double bond would have the (Z)-configuration (Townsend,
C.A. and Basak, A. Tetrahedron (1991) 47:2591-2602) and
not the (E)-configuration.

The genetic basis of secondary metabolite
biosynthesis essentially exists in the genes which code
for the individual biosynthetic enzymes and in the
regulatory elements which control the expression of the
biosynthetic genes. The genes encoding biosynthesis of
polyketides in actinomycetes have hitherto been found as
clusters of adjacent genes, ranging in size from
20 kilobasepairs (kbp) to over 100 kbp. The clusters
often contain specific regulatory genes and genes
conferring resistance of the producing strain to its own
antibiotic.

In various of its aspects the invention provides the
following:

(1) a DNA sequence encoding at least one peptide
necessary for the biosynthesis of monensin, preferably
comprising one or more of the following genes: mon BI,
mon BII, mon CI, mon CII, mon H, mon RI, mon RII, mon T,
mon AIX and mon AX as depicted in the appended sequence
data or an allele or mutation thereof;

(2) a DNA sequence according to the first aspect
comprising all of the genes listed therein or an allele
or mutation thereof;

(3) a DNA sequence according to the first aspect
comprising the complete monensin gene cluster;

(4) a DNA sequence coding for one or more of the
peptides set out below, said peptide having the amino
acid sequence as set out in the appended sequence data or
being a variant thereof having the specified activity:

peptide      activity

mon CII      epoxyhydrolase/cyclase
mon E        S-adenosylmethionine-dependent methyltransferase
mon T        monensin resistance gene
mon RII      repressor protein
mon AIX      thioesterase
mon AI       polyketide synthase multienzyme
mon AII      polyketide synthase multienzyme
mon AIII     polyketide synthase multienzyme
mon AIV      polyketide synthase multienzyme
mon AVI      polyketide synthase multienzyme
mon AVII     polyketide synthase multienzyme
mon AVIII    polyketide synthase multienzyme
mon H        regulatory protein
mon CI       flavin-dependent epoxidase
mon BII      carbon-carbon double bond isomerase
mon BI       carbon-carbon double bond isomerase
mon D        cytochrome P450 hydroxylase
mon RI       activator protein
mon AX       thioesterase

(5) a recombinant cloning or expression vector 
comprising a DNA sequence according to any of aspects 1-4; 

(6) a transformant host cell which has been 
transformed to contain a DNA sequence according to any of 
aspects 1-4 and is capable of expressing a corresponding 
peptide; 

(7) a hybridization probe comprising a polynucleotide 
which binds specifically to a region of the monensin gene 
cluster selected from mon BI, mon BII, mon CI, mon CII,
mon H, mon RI, mon RII, mon T, mon AIX and mon AX;



(8) use of a probe according to aspect (7) in a
method of detecting the presence of a gene cluster which
governs the synthesis of a polyether, and optionally
isolating a gene cluster detected thereby;

(9) use of a probe comprising a polynucleotide which
binds specifically to a gene responsible for levels of
activity of the monensin gene cluster, preferably a
regulatory gene, resistance gene or thioesterase gene,
more preferably the regulatory gene mon RI, in a method of
detecting an analogous gene in a gene cluster of another
polyketide, preferably a polyether, and optionally
manipulating the gene detected thereby to alter the level
of expression of said other polyketide;

(10) a host cell, preferably Streptomyces
cinnamonensis, containing a heterologous gene under the
control of the mon RI gene and a monensin promoter;

(11) use of a portion of the monensin gene cluster
having chain terminating activity, preferably comprising
at least one of mon AIX and mon AX or a mutant or allele
thereof having chain terminating activity, to effect chain
release of a peptide other than one required for monensin
biosynthesis;

(12) use of a portion of the monensin gene cluster
having carbon-carbon double bond isomerase activity,
preferably comprising at least one of mon BI and mon BII
or a mutant or allele thereof having isomerase activity, to
provide a desired stereochemical outcome in the synthesis
of a polyketide other than monensin;

(13) a polypeptide encoded by a portion of the
monensin gene cluster, preferably comprising at least one
of mon BI and mon BII or a mutant or allele thereof,
having carbon-carbon double bond isomerase activity;

(14) an epoxidase enzyme encoded by mon CI or a
derivative or variant thereof having epoxidase activity;

(15) a cyclase enzyme encoded by mon CII or a
derivative or variant thereof having cyclase activity.

Some embodiments of the invention will now be
described by way of example with reference to the
accompanying drawings in which:

Fig 1 shows the structure of monensins A and B;
Fig 2 illustrates proposed biosynthetic pathways;
Fig 3 illustrates the proposed organization of the
monensin polyketide synthase (PKS) enzyme complex; and
Fig 4 illustrates the proposed organization of the
monensin biosynthetic gene cluster.

The overall gene organization of the monensin
biosynthetic gene cluster, as shown in Fig 4, is similar
to that previously found for many macrolide biosynthetic
gene clusters, which have one or more open reading frames
(ORFs) encoding large multifunctional PKSs flanked by
other genes which encode functions required for the
biosynthesis of the antibiotic. In the case of monensin,
there is an unusually high number of distinct ORFs
encoding PKS multi-enzymes (eight in total, labelled monAI
to monAVIII) but there is again a separate module of
enzymes for each cycle of polyketide chain extension,
exactly as found for modular PKSs for macrolide
biosynthesis (see Fig 3). Thus there are 12 condensations
predicted to be required for the production of the carbon
skeleton of monensin, and in agreement with this there are
found to be 12 extension modules of PKS enzymes
distributed among the 8 PKS ORFs. However, as mentioned in
detail below, the other genes in the monensin cluster
include genes which have not previously been found in any
other gene cluster for the biosynthesis of a complex
polyketide, and which are not significantly similar to any
genes in published sequence databases. The cloned DNA for
these genes is useful to allow the diagnosis that a
polyketide biosynthetic gene cluster in any actinomycete,
uncovered previously by conventional hybridization against
a PKS gene probe from (say) the DEBS or some other
characterised PKS gene cluster, is one that governs the
synthesis of a polyether; and these genes are also
valuable either singly or in combination as specific
hybridization probes for the specific detection and
isolation of additional polyether biosynthetic gene
clusters. Examples of these previously-unknown genes are
the genes monBI, monBII, monCI and monCII. In addition the
regulatory genes monH, monRI and monRII, the resistance
gene monT and the thioesterase genes monAIX and monAX are
all useful for the detection of analogous genes in other
polyether clusters which are required for the rational
manipulation of such genes in order to increase levels of
the specific product.
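
The agreement between the 12 predicted condensations and the
precursor-labelling result cited earlier (five acetate, one
butyrate and seven propionate units) can be checked with simple
arithmetic. The following is a minimal sketch only. The carbon
count per unit is standard chemistry rather than a statement of
this specification, and the function and variable names are
hypothetical.

```python
# Precursor units of monensin A as reported from the carbon-14
# labelling experiments cited in the text (Day et al., 1973).
PRECURSOR_UNITS = {"acetate": 5, "propionate": 7, "butyrate": 1}

# Carbons each unit contributes to the chain. These values are an
# assumption taken from general chemistry (C2, C3 and C4 acids),
# not from the specification.
CARBONS_PER_UNIT = {"acetate": 2, "propionate": 3, "butyrate": 4}


def chain_statistics(units, carbons_per_unit):
    """Return (total units, condensations, chain carbons).

    One unit acts as the starter, so a chain assembled from n units
    needs n - 1 condensations (one per extension module).
    """
    total_units = sum(units.values())
    condensations = total_units - 1
    chain_carbons = sum(n * carbons_per_unit[name] for name, n in units.items())
    return total_units, condensations, chain_carbons


if __name__ == "__main__":
    total, condensations, carbons = chain_statistics(PRECURSOR_UNITS, CARBONS_PER_UNIT)
    print(f"units: {total}, condensations: {condensations}, chain carbons: {carbons}")
```

Thirteen units give twelve condensations, matching the twelve
extension modules described above. The 35 chain carbons plus the
single O-methyl carbon introduced late in the pathway are
consistent with the 36 carbons of monensin A (C36H62O11, a
literature value not given in this text).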

The cloned and sequenced cluster of genes for
monensin biosynthesis is useful secondly in the
engineering of mutant strains of S. cinnamonensis and of
other actinomycetes which are suitable strains for the
high level production of either natural or novel
recombinant polyketides. The sequence of the monensin
cluster disclosed here shows the surprising fact that the
gene cluster contains a gene monRI whose gene product has
an amino acid sequence highly similar to that of
actII-orf4, the pathway-specific activator gene which activates
the actI and other promoters of the actinorhodin
biosynthetic gene cluster of Streptomyces coelicolor. The
recognition of this aspect of the natural regulation of a
Type I PKS cluster is important and valuable because
first, it is possible to increase the yield of monensin by
increasing the level of the activator MonRI, either by
placing the gene monRI under the control of a powerful
promoter or arranging for the presence within the cells of
one or more additional copies of the monRI gene (as
exemplified below); secondly, it will be possible to use
the monRI gene as a specific hybridisation probe to locate
similar genes in other complex PKS gene clusters,
especially other polyether PKS gene clusters but also
polyene and macrolide gene clusters and all other Type I
modular PKS gene clusters, even in cases where (as for
rapamycin and erythromycin) no such gene has been
previously found within the currently accepted physical
limits of the relevant biosynthetic gene cluster. In such
cases the monRI gene probe might be expected to uncover
the activator even if it resides on the chromosome at some
distance from the main body of the gene cluster; and
simple experiments would then show whether the
activator(s) so uncovered are involved in regulation of
the biosynthesis of those particular metabolites; thirdly,
increasing the copy number of the monRI gene or of any of
the activator genes uncovered will tend to increase the
yield of a heterologous polyketide by "crosstalk" where
the activator mimics the presence of the normal activator
for the transcription of the genes for that heterologous
polyketide synthase. It is clear from recently published
work (Wietzorrek, A. and Bibb, M. Mol. Microbiol. (1997)
25:1181-1184) that the ActII-orf4 family of activators
exerts its effects by binding to promoter regions within
the target gene cluster, so it will be possible to use the
monRI gene together with monensin promoter regions to
drive the high-level transcription and translation of
heterologous genes in Streptomyces cinnamonensis, and
perhaps in other host strains too; such genes need not be
PKS genes or even involved in polyketide biosynthesis.
Monensin promoter regions are found at the 5' end of genes
or groups of genes in the cluster and their location is
clear from the sequence analysis disclosed here. Thus a
useful vector would provide the monensin promoter and the
ribosome binding site and continue up to the start of the
open reading frame, after which the monensin ORF naturally
found there would be replaced by the heterologous gene.
The relative strength of the monensin promoters can be
readily determined using any one of a number of known
promoter probes, i.e. genes whose expression gives rise to
readily measurable and quantifiable effects, such as Green
Fluorescent Protein (GFP), or beta-galactosidase in the
presence of a chromogenic substrate. It should be possible
to mutate randomly the small region of the monensin
promoters especially likely to interact with the MonRI
activator (identified by the presence of tandem
heptanucleotide repeats with a common consensus sequence
between the various monensin promoters) (Wietzorrek, A.
and Bibb, M. Mol. Microbiol. (1997) 25:1181-1184), and to
determine the optimal DNA sequence for the maximal
activation effect using either S. cinnamonensis
(preferably, in case there are other unknown factors that
make the activation function better in this strain than in
other heterologous systems) or even another host
actinomycete strain. If the natural monensin promoters
were mutated to have this optimal recognition sequence,
then this would further increase the production of
monensin. By extension, the use of this modified monensin
promoter in conjunction with the monRI gene in
heterologous systems could form the basis of further
improvements in expression of polyketide synthases or
other genes, either by appropriate chromosomal alterations
to introduce the altered promoter and also the monRI gene,
or by provision of vectors containing these optimised
signals linked to specific genes and housed in suitable
host cells.
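
The identification of candidate activator-binding regions by
their tandem heptanucleotide repeats, as mentioned above, can be
sketched computationally. The following is an illustrative
sketch only and not the method used to analyse the monensin
promoters. The function names, the mismatch tolerance, the
optional spacer and the toy sequence are all hypothetical
assumptions, and no actual consensus sequence is implied.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_tandem_repeats(seq: str, unit: int = 7, spacer: int = 0,
                        max_mismatches: int = 0):
    """Scan a DNA string for two successive `unit`-long windows,
    separated by `spacer` bases, that differ at no more than
    `max_mismatches` positions.

    Returns a list of (start position, first unit, second unit).
    All parameters are illustrative assumptions.
    """
    seq = seq.upper()
    span = 2 * unit + spacer
    hits = []
    for i in range(len(seq) - span + 1):
        first = seq[i:i + unit]
        second = seq[i + unit + spacer:i + span]
        if hamming(first, second) <= max_mismatches:
            hits.append((i, first, second))
    return hits


if __name__ == "__main__":
    # Toy sequence with an arbitrary heptamer repeated twice; it is
    # NOT taken from the monensin cluster.
    toy = "ACGT" + "GATTCCA" + "GATTCCA" + "TGCT"
    for position, first, second in find_tandem_repeats(toy):
        print(position, first, second)
```

Relaxing max_mismatches or setting a non-zero spacer widens the
search. Hits would in any case need to be compared across the
several promoters of the cluster before a shared consensus could
be proposed.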

The sequencing of the monensin cluster has uncovered
another strategy for gene regulation in such Type I
clusters. The previously-sequenced genes for the rapamycin
biosynthetic pathway in Streptomyces hygroscopicus
included a gene of unknown function (rapH). A closely
similar gene has now been found in the monensin
biosynthetic gene cluster (monH), and it is clear from
this recurrence (and the comparison of the sequences with
those of database proteins) that this gene is potentially
an important DNA-binding sensor gene which acts to
regulate the transcription of the cluster in concert with
other regulatory signals. Simple experimentation is needed
in order to define whether the gene is an activator, in
which case putting in another copy or increasing its
transcription will have the potential to increase
polyketide biosynthesis; or alternatively the rapH gene
product may be a negative regulator, whereupon deletion of
this gene may release the biosynthetic pathway from this
inhibitory effect and increase yields.

There is a continuing need to develop new methods of
high-level production of bioactive metabolites and other
valuable gene products in actinomycetes. Streptomyces
cinnamonensis is a recognised and very valuable industrial
strain for the production of very high levels of monensin;
it is readily transformable with DNA by standard methods
of conjugation or of protoplast transformation; it is a
host for numerous known broad range plasmids including
well-known expression plasmids of both high- and low-copy
number; it also grows quickly relative to other
actinomycete strains (for example about three times faster
than wild type Saccharopolyspora erythraea, the
erythromycin producer, under comparable conditions) and
sporulates relatively easily. Heterologous polyketides can
be expressed in Streptomyces cinnamonensis using for
example the low-copy number plasmid pCJR24 (which has no
origin of replication active in actinomycetes, so is
maintained by integration into the chromosome) (Rowe, C.
et al. Gene (1998) 216:215-223) or the related plasmid
pCJR29 in which the polyketide synthase gene(s) are placed
under the control of the actI promoter which is activated
by the ActII-orf4 activator; or alternatively the monAI
promoter can be substituted together with the MonRI
activator; or some other pairing of activator and cognate
promoter chosen from either a Type II or a Type I
polyketide synthase gene cluster. As an example, the wild
type strain of Streptomyces cinnamonensis has been used to
express the plasmid pCJR29 (Rowe, C. et al. Gene (1998)
216:215-223) containing as insert the three ORFs for the
PKS governing the production of 6-deoxyerythronolide B,
the macrolide precursor of erythromycin A in
Saccharopolyspora erythraea, these genes being placed
under the control of the pathway-specific actI promoter
from Streptomyces coelicolor together with its cognate
activator gene actII-orf4. The transformed strain when
cultivated in a suitable liquid medium produced
6-deoxyerythronolide B in good yield.

It is well known to the person skilled in the art
that it is possible to use standard vectors unable to
replicate in actinomycetes to introduce DNA into a
Streptomyces cell, such DNA comprising two portions of
contiguous DNA which are each identical to one of two
portions of the cell's chromosome that are spaced up to
100 kbp apart; and that through recombination between the
incoming DNA and the chromosome occurring in both portions
of DNA the net result is that the chromosomal sequence is
replaced by the defective sequence originally that of the
incoming DNA. Such a procedure has been applied to the
monensin-producing strain of S. cinnamonensis as described
in detail below, and a strain of S. cinnamonensis has been
obtained that carries a specific deletion in the monensin
cluster and which is unable to produce the antibiotic. The
use of such a strain facilitates the production of
heterologous polyketides by removal of the background of
monensin production.

The multiple uses of portions of the cloned and
sequenced DNA from the monensin cluster will readily occur
to the person skilled in the art. A surprising feature of
the PKS of the monensin cluster is an unusual mechanism of
polyketide chain initiation. We have found that the
monensin PKS loading module has three domains, which from
the amino-terminus of the protein are: a KSq domain, an
acyltransferase domain and an ACP domain. We have
uncovered this organisation in the PKS for the 14-membered
macrolide oleandomycin as well as in the monensin PKS, an
organisation of the loading module previously only found
for the 16-membered macrolides and in which the KSq domain
(which looks like a ketosynthase or condensation domain
except that the active site cysteine residue is
substituted by a glutamine for which the single letter
notation is Q) had been previously speculated to have no
function. It was realised that the acyltransferase of the
loading module actually has malonyl-CoA and not acetyl-CoA
as a substrate and that KSq is an active decarboxylase. It
appears that a better discrimination can be achieved in
the selection of the smaller acetate unit over propionate
if the choice is made initially between methylmalonyl- and
malonyl-CoA.

An unprecedented feature of the monensin PKS genes is
that no integral chain-terminating domain is present as a
C-terminal appendage of the PKS extension module that
catalyzes the twelfth and final chain extension. Because
the product of the monensin PKS is a carboxylic acid, it
would have been firmly predicted that chain release would
have been catalyzed by such a C-terminal domain containing
a "thioesterase" activity. Previously sequenced PKS gene
sets have been of two sorts: first, those macrolide PKSs
typified by erythromycin, spiramycin, tylosin and niddamycin,
which have a readily recognisable C-terminal
"thioesterase" domain, which in these enzymes functions as
a specific cyclase rather than releasing the polyketide
product as a free carboxylic acid; secondly, those
macrolide PKSs typified by rapamycin, FK506, and
rifamycin, where there is an alternative and recognised
mode of chain termination by transfer of the polyketide
chain to an acceptor moiety, catalyzed by a specific
enzyme (eg pipecolate incorporating enzyme for rapamycin
(Schwecke T. et al. Proc. Natl. Acad. Sci. USA (1995)
92:7839-7843) and FK506 (Motamedi H. and Shafiee A. Eur.
J. Biochemistry (1998) 256:528-534); arylamine synthetase
for rifamycin (August P.R. et al. Chemistry & Biology
(1998) 5:69-79)).

The monensin PKS surprisingly falls into neither
category, and therefore seems to be the first example of a
novel mode of chain termination. It is novel and
noteworthy in this connection that the monensin PKS gene
cluster contains two small genes that encode discrete,
monofunctional thioesterase enzymes. Although many PKS
gene clusters have been previously shown to contain one
such discrete thioesterase, none have been shown to have
two. The role of such thioesterases is not known, although
in the case of methymycin/pikromycin PKS, which has been
reported to be responsible for the biosynthesis of both
the 12-membered macrolide methymycin and the 14-membered
macrolide pikromycin (Xue Y.Q. Proc. Natl. Acad. Sci. USA
(1998) 95:12111-12116) the disruption of this thioesterase
reportedly caused a ten-fold drop in the amount of both
macrolides produced. A similar finding has been reported
for the discrete thioesterase of the tylosin PKS gene
cluster (Cundliffe E. et al. Chemistry & Biology in
press). Additional copies of such thioesterases may
therefore accelerate the production of a specific
polyketide, but this has not yet been demonstrated.
However, the presence of the discrete thioesterase is not
completely essential for polyketide production.

It is highly desirable to have a broadly effective
method of catalysing the release of polyketide gene
products from a PKS as the free acid. The well-studied
integral thioesterase domain in the erythromycin PKS
has a broad specificity in cyclization to
form a lactone (assuming that a hydroxy group is present
in the growing polyketide chain at an appropriate
position), but hydrolysis to form the free acid is very
slow. The recognition of the unusual arrangement of the
monensin PKS means that it is now possible to harness
either the entire PKS module that catalyses the twelfth
and final extension cycle in monensin biosynthesis, or the
C-terminal portion of it, and graft it onto a different
polyketide synthase by genetic engineering, so as to allow
the release mechanism characteristic of monensin to
operate in a different context. The use of this portion
only of the monensin PKS suffices to allow the novel
mechanism of chain release to operate successfully. The
speed of the polyketide chain hydrolysis in a given case
can depend on the additional presence of one or both of
the discrete thioesterase genes (monAIX and monAX) from
the monensin gene cluster. The use of this novel method of
chain termination represents a valuable way of generating
a large number of novel engineered polyketides that are
currently inaccessible, and ensuring that the products
have a specified chain length.

The genes monBI and monBII appear to encode very
similar enzymes with significant amino acid sequence
similarity to authentic ketosteroid isomerases which are
known to catalyse the migration of an activated
carbon-carbon double bond. The conservation of active site
residues makes it very likely that these mon genes govern
a reaction involving activated double bonds in the
biosynthetic pathway to monensin and this surprising
observation can be accommodated if the initial product of
the polyketide chain growth on the monensin PKS is a
linear precursor in which the double bonds were initially
formed with a conventional trans or E (entgegen) geometry;
but before the polyketide chain was extended by insertion
of the next unit the monBI and/or the monBII gene
product(s) catalyse the specific rearrangement of the
newly-created double bond into the cis or Z (zusammen)
geometry. This new view of the monensin biosynthetic
pathway allows the deduction that the monBI and monBII
genes, perhaps in combination with specific portions of
the monensin modules where they normally exert their
effects (namely modules 3, 5 and 7) might be used in order
to achieve the extremely desirable targeted biosynthesis
of novel polyketides containing double bonds with Z
geometry at specified point(s) along the chain. Thus for
example it should be possible to provide for the direct
biosynthesis of C22-C23 cis or Z double bond in
avermectins, thus avoiding tedious and expensive chemical
conversion of an initial fermentation product into this
important antihelminthic. Only limited experimentation is
needed to see whether the monBI and/or monBII gene
products are sufficient or whether the mon PKS at modules
3, 5 and 7 forms part of the specific docking site(s) for
the isomerases and therefore must also be used in the
creation of the hybrid PKS that will insert the cis or Z
double bond at the desired position. The substrate
specificity of the isomerases need not be limited to
2,3-unsaturated thioesters. The purified enzymes could also be
used to effect such isomerisations in vitro, depending on
the position of the equilibrium or whether further enzymes
are used to achieve the further transformation of the
product as it is formed (vide infra).

The product of the monCI gene is a novel oxidative 
enzyme with some sequence similarity to authentic examples 
of such enzymes in the databases; and with a clearly 
definable role in the monensin biosynthetic pathway, the 
epoxidation of the double bonds at three separate 
positions in the initially-formed acyclic intermediate in 
monensin biosynthesis. This epoxidase could therefore be 
used in conjunction with monBI/monBII gene products to 
effect oxidative reactions on suitable substrates in vitro 
and in vivo. Similarly the monCII gene product is a 
putative cyclase that opens the epoxides and causes the 
formation of ether rings in monensin. 

Any or all of the monBI, monBII, monCI or monCII
genes may be introduced into a heterologous strain
containing the gene cluster for another polyether, in
order to divert the biosynthetic pathway and produce a
polyketide of altered structure. In these experiments the
analogues of these monB genes could either be present or
(once located and characterised using the mon genes as
probes) they may be deleted prior to the introduction of
the monB and monC genes into that strain. The converse
experiment in which analogues of the monB and monC genes
from other strains are introduced into S. cinnamonensis
likewise has the potential to produce novel oxidised
polyketides. Also, the monB and monC genes or their
analogues may be introduced into a strain that normally
produces a macrolide or a polyene or some other complex
polyketide and expressed there, when they may effect the
diversion of the growing polyketide chain on a
heterologous modular PKS towards a new product, which may
or may not have the structure of a polyether.

The availability of the monensin gene sequence allows
the institution of domain swaps to alter the
acyltransferase (AT) specificity of a given module. For
example the ethylmalonyl-CoA specific extender found in
one of the modules of the monensin PKS can be used to
replace one of the other ATs to generate an ethyl side
branch at that position in the chain, or the AT can be
used to substitute in any other (e.g. macrolide) PKS, as
described in WO 98/01571 and WO 98/01546. Similarly the
alteration of the level of reduction in a module, by
manipulation of the reductive enzymes, can be applied to
the monensin genes and here it will produce, depending on
which module is affected, either an altered monensin, or a
species which is only partly cyclised, or a polyether with
an altered pattern of cyclisation, or even a linear
polyketide.

In general the targeted alteration of the pattern of
substitution of sidechains or reduction level along the
polyketide chain produced by the monensin PKS will, like
the disruption or deletion of the oxidative enzymes
mentioned above, lead to non-polyether polyketide
products. It should be possible, by introduction of the
DEBS thioesterase at the C-terminus of one of the later
modules of the monensin PKS, together with an
appropriately placed hydroxy group earlier in the chain,
to produce novel macrolide products from this polyether
PKS system, or alternatively novel polyenes of defined
chain length and chosen ring size.

Example 1

Cloning of the monensin A biosvnthetic gene cluster using 
DNA probes derived from the ervthr omycin-producing 
polvketide synthase of SaccharonolvsjDora ervthraea 

A genomic library of the monensin A producing strain
Streptomyces cinnamonensis ATCC 15413 was constructed
using methods well known in the art, namely, the
production of high molecular weight genomic DNA, followed
by the partial cleavage of this DNA using the
frequent-cutting restriction enzyme Sau3A, fractionation of
the fragments on a sucrose gradient and selection of
fragments of average size 35-40 kbp, and the cloning of
these fragments into the cosmid vector pWE15 (Evans, G.A.
et al. Gene (1989) 79:9-20) which had been previously
digested with BamHI and treated with shrimp alkaline
phosphatase. The library was packaged and transfected into
Escherichia coli XL-1 Blue MR cells. The library was plated
out on 2xTY agar medium (10 g tryptone, 10 g yeast extract,
5 g NaCl, 15 g bactoagar per litre containing ampicillin
50 µg/ml) for cosmid selection and the colonies were
allowed to grow overnight. The library was then screened by
hybridisation using as a probe DNA encoding the
ketosynthase domain of module 1 of the
erythromycin-producing PKS (6-deoxyerythronolide B
synthase, DEBS) of Saccharopolyspora erythraea. The
colonies giving a positive hybridisation signal were
selected and the cosmid DNA from each colony was purified
and mapped by restriction digestion. The presence of the
target biosynthetic genes on a cosmid was verified by
sequencing of the ends of the cosmid inserts using the
commercially available T3 and T7 primers which hybridise
specifically to the respective ends of each cosmid insert
(Evans, G.A. et al. Gene (1989) 79:9-20).
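
The cloning step above relies on Sau3A fragments being ligatable into a BamHI-cut vector. The following minimal sketch illustrates why, using the standard recognition sequences and cut positions of the two enzymes (^GATC and G^GATCC). The function names are hypothetical, and the spacing estimate assumes random DNA with equal base frequencies, which is only a rough approximation for GC-rich Streptomyces DNA.

```python
# Recognition sequence (5'->3', top strand) and the index at which the
# top strand is cut: Sau3A cuts ^GATC, BamHI cuts G^GATCC.
ENZYMES = {
    "Sau3A": ("GATC", 0),
    "BamHI": ("GGATCC", 1),
}


def five_prime_overhang(site: str, cut: int) -> str:
    """Single-stranded 5' overhang left by a palindromic site that is
    cut at index `cut` on each strand."""
    return site[cut:len(site) - cut]


def expected_spacing(site: str) -> int:
    """Mean distance between sites in random DNA with equal base
    frequencies (a simplifying assumption)."""
    return 4 ** len(site)


overhangs = {name: five_prime_overhang(site, cut)
             for name, (site, cut) in ENZYMES.items()}
print(overhangs)                   # both enzymes leave the same overhang
print(expected_spacing("GATC"))    # average spacing of a 4-bp site
```

Because a four-base site is expected so frequently, only a partial digest yields fragments in the 35-40 kbp range selected for the cosmid library.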
Example 2

Sequencing of the biosynthetic gene cluster for monensin A
from Streptomyces cinnamonensis

Three cosmids obtained by screening of the genomic
library of S. cinnamonensis were used to obtain the entire
DNA sequence of the monensin biosynthetic gene cluster.
These cosmids, MO-CN02, MO-CN11 and MO-CN33, between them
contain the entire DNA sequence of the cluster and the
adjacent regions of the chromosome. They have been
deposited in NCIMB, 23 St Machar Drive, Aberdeen AB24
3RY, UK, under the NCIMB accession numbers 40956
(MO-CN11), 40957 (MO-CN33) and 40958 (MO-CN02)
respectively.

The DNA of each cosmid was separately subjected to
partial digestion with Sau3A and fragments of
approximately 1.5-2.0 kbp were separated by agarose gel
electrophoresis. The fragments were then ligated into the
plasmid vector pUC18 (Messing, 1982), previously digested
with BamHI and treated with shrimp alkaline phosphatase.
The library was transformed into E. coli strain XL1-Blue
MR and plated on 2xTY agar medium containing ampicillin
(100 µg/ml) to select for plasmid-containing cells.
Plasmid DNA was purified from individual colonies and
sequenced using the Sanger dye-terminator procedure on an
ABI 377 automated sequencer (Sanger, F. Science (1981)
214:1205-1210). The sequence data obtained from single
random subclones of a cosmid were assembled into a single
continuous sequence and edited using the GAP4.1 program of
the STADEN gene analysis package (Staden, R. Molecular
Biotechnology (1996) 5:233-241).

The sequence is set out in the appended sequence
listing.

Tables I and II contain data about individual genes
and gene products.
Example 3

Inactivation of the monensin A biosynthetic gene cluster

A chromosomal gene disruption experiment was used to
verify the identity of the cloned polyketide synthase gene
cluster. Plasmid pMOB6314 is a pUC18 sequencing subclone
of the presumed monensin A biosynthetic gene cluster
prepared as described in Example 1, whose inserted DNA
comprises the DNA sequence from nucleotide 9763 to
nucleotide 10108 in SEQ ID 1, and which therefore contains
a region of DNA wholly internal to orfE, a putative
3-O-methyltransferase. A HindIII fragment containing the
thiostrepton resistance gene tsr from plasmid pIJ702
(Katz, E. et al. J. Gen. Microbiol. (1983) 129:2703-2714)
was cloned into the HindIII site of plasmid pMOB6314 and
the ligation mixture was used to transform E. coli cells.
Transformants bearing the required plasmid pMOAE01 were
identified by isolation of plasmid DNA and analysis by
restriction digestion. Plasmid pMOAE01 was used
to transform protoplasts of Streptomyces cinnamonensis as
described by Hopwood, D.A. et al. (1985). Since plasmid
pMOAE01 lacks an origin of replication that is active in
Streptomyces, growth in the presence of thiostrepton
(25 µg/ml) in the regeneration medium led to the isolation
of stable integrants. Isolated putative integrants were
tested for the presence of integrated pMOAE01 sequences by
Southern hybridisation. A clone of Streptomyces
cinnamonensis identified by its restriction pattern in
Southern hybridisation as bearing pMOAE01 integrated in
the region of monE of the monensin A biosynthetic gene
cluster was designated S. cinnamonensis MO-DD01.

Production of monensin A related metabolites by
S. cinnamonensis MO-DD01 was assessed by GC-MS analysis of
methanol extracts of the entire broth, harvested after
72 hours of growth of the strain. No significant amounts of
monensin A-related metabolites were detectable.
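
The disruption strategy depends on the cloned insert lying wholly inside the target gene, so that single-crossover integration interrupts the coding sequence. A short sketch of that coordinate check follows, using the insert coordinates given above and the monE start and end positions listed in Table I (10426 and 9596). It assumes that orfE and monE denote the same gene, and the helper names are illustrative only.

```python
def span(start: int, end: int) -> tuple[int, int]:
    """Normalise a (start, end) pair to (low, high), whatever the strand."""
    low, high = sorted((start, end))
    return low, high


def is_wholly_internal(insert: tuple[int, int], gene: tuple[int, int]) -> bool:
    """True if the insert lies strictly inside the gene, touching neither end."""
    return gene[0] < insert[0] and insert[1] < gene[1]


MON_E = span(10426, 9596)      # monE start/end as listed in Table I
INSERT = span(9763, 10108)     # insert of pMOB6314 (SEQ ID 1 coordinates)

print(is_wholly_internal(INSERT, MON_E))
print(INSERT[1] - INSERT[0] + 1)   # insert length in nucleotides, inclusive
```

An insert that reached either end of the gene could regenerate an intact copy on integration, which is why the strict inequality is used.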
Example 4

Overproduction of erythromycin aglycone in Streptomyces
cinnamonensis

S. cinnamonensis is a suitable system for
overproduction not just of monensin A but also of other
polyketide metabolites. Established techniques of genetic
transformation allow fast introduction of foreign
polyketide-producing gene sets into this host. Fast
growth of S. cinnamonensis in liquid culture and optimal
precursor supply favour high yields of polyketide
metabolites.

Construction of pIB061

S. erythraea NRRL2338 was transformed with pCJR30
(Rowe, C. J., et al. (1998) Gene 216:215-223) using a
routine protoplast transformation technique as described
by Hopwood et al. (1985). A stable integrant, S.
erythraea [pCJR30], was identified and the production of
10 mg/L of the triketide lactone (the delta lactone of
(2S,3R,4R,5R)-2,4-dimethyl-3,5-dihydroxyheptanoic acid)
in addition to erythromycins was confirmed by MS
analysis.

Total DNA of S. erythraea [pCJR30] was purified and
approximately 200 ng was digested with EcoRI endonuclease.
The digestion mixture was precipitated with isopropanol
and the resulting DNA was treated with T4 DNA ligase for
16 hours at 16°C. The ligation mixture was used to
transform E. coli DH10B cells. The transformants were
screened for the presence of the plasmid. A clone
containing a 44.7 kb plasmid was identified and confirmed
by restriction analysis to contain three complete genes:
eryAI, eryAII and eryAIII. The plasmid was named pIB061.

Transformation of S. cinnamonensis

Protoplasts of S. cinnamonensis were prepared by a
modified procedure of Hopwood et al. (1985). Plasmid
pIB061 was transformed into the protoplasts of S.
cinnamonensis and stable thiostrepton-resistant colonies
were isolated. Individual colonies were checked for their
plasmid content and the presence of plasmid pIB061 was
confirmed by its restriction pattern. S. cinnamonensis
(pIB061) was inoculated into 250 ml of M-C3 minimal
production medium containing 10 µg/ml of thiostrepton and
allowed to grow for 72 hours at 30°C. After this time the
mycelia were removed by filtering. The broth was extracted
with two volumes of ethyl acetate and the combined ethyl
acetate extracts were washed with an equal volume of
saturated sodium chloride, dried over anhydrous sodium
sulphate, and the ethyl acetate was removed under reduced
pressure to give about 200 mg of crude product. The
product was analysed by LCQ and its mass was confirmed to
correspond to that of erythronolide B.

This example demonstrates the value of S.
cinnamonensis for production of high levels of foreign
polyketide antibiotics. Introduction of the complete
erythromycin gene cluster or other gene clusters into this
system is likely to produce high levels of the
corresponding metabolites.

Example 5

Construction of plasmid pCJW58 containing the monensin
activator gene under the ermE* promoter

The ermE* promoter derived from the ermE resistance
methyltransferase gene of S. erythraea (Bibb et al. Gene
(1985) 38:215-226) was amplified by PCR as a SpeI-XbaI
fragment using the following oligonucleotides:
5'-CCACTAGTATGCATGCGAGTGTCCGTTCGAGT-3' and
5'-TTGTATACACCTAGGATGGTTGGCCGTGC-3', with pRH3 (Dhillon et
al. Molecular Microbiology (1989) 3:1405-1414) as a
template, and cloned into SmaI-digested,
phosphatase-treated pUC18, to produce plasmid pIB135. The
integrative plasmid pSET152 (Bierman, M. et al. (1992) Gene
116:43-49) was digested with XbaI and the backbone was
dephosphorylated and ligated to the SpeI-XbaI fragment of
pIB135 containing the ermE* promoter. The ligation mixture
was used to transform E. coli DH10B and the orientation of
the insert in the plasmids from individual clones was
checked by restriction analysis. A plasmid with the ermE*
promoter oriented so that the NdeI and XbaI sites are
adjacent to the apramycin resistance gene was selected and
named pIB139.

The monR gene from the monensin biosynthetic gene
cluster was amplified, and NdeI and XbaI restriction sites
were introduced at the 5' and 3' ends respectively, by PCR
using as primers the following oligonucleotides:
5'-AGA TAG CAT ATG CTG GGC CCG CTC CGC AT-3'
and 5'-AAT GCT CTA GAC TGT CAG CGA CCG GAC AGG GCC AA-3'
and cosmid MO-CN11 as template. The PCR product was
ligated into SmaI-treated and phosphatase-treated plasmid
pUC18 and the ligation mixture was used to transform E.
coli DH10B cells. Transformant colonies were analysed for
the presence of plasmid and the identity of the plasmid
inserts was verified by sequencing. A plasmid whose
insert contained the monR gene flanked by NdeI and XbaI
restriction sites was selected and designated pCJW57.
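
As an illustration of how the two monR primers carry the introduced restriction sites, the sketch below searches each primer for the standard NdeI (CATATG) and XbaI (TCTAGA) recognition sequences. The primer sequences are those quoted above; the function and variable names are hypothetical.

```python
SITES = {"NdeI": "CATATG", "XbaI": "TCTAGA"}

# Primers as quoted in the text (spaces separate triplets only).
FORWARD = "AGA TAG CAT ATG CTG GGC CCG CTC CGC AT"
REVERSE = "AAT GCT CTA GAC TGT CAG CGA CCG GAC AGG GCC AA"


def find_sites(primer: str) -> dict[str, int]:
    """Map each enzyme whose site occurs in the primer to the 0-based
    position of the first occurrence."""
    seq = primer.replace(" ", "").upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


print(find_sites(FORWARD))   # NdeI site in the 5' primer
print(find_sites(REVERSE))   # XbaI site in the 3' primer
```

Each primer contains exactly one of the two sites, consistent with the directional NdeI-XbaI cloning of monR into pIB139 described in the following paragraph.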

Plasmid pCJW57 was digested with NdeI and XbaI and
the fragment containing the monR gene was ligated together
with the backbone of plasmid pIB139 which had been
digested with the same two restriction enzymes and
purified by gel elution. The ligation mixture was used to
transform E. coli strain DH10B cells. Transformant
colonies were analysed for the presence of plasmid and the
identity of the plasmid inserts was verified by
restriction analysis. One such recombinant was selected
and named plasmid pCJW58.

Plasmid pCJW58 was used to transform the
methylation-deficient E. coli strain ET12567 (MacNeil D. J.
et al. (1992) Gene 111:61-68) and the recovered,
unmethylated plasmid was then used to transform the same
E. coli strain ET12567 housing the plasmid pUB307, a
derivative of RP4 which is mob+ and which contains a gene
for kanamycin resistance (Piffaretti, J. C. et al. (1988)
Mol. Gen. Genet. 212:215-218). Recombinants were plated on
2xTY agar medium containing apramycin and kanamycin, each
at a final concentration of 50 micrograms per ml. The
plasmid content of recombinants was checked by isolation of
plasmid DNA and verification of the identity of these
plasmids by restriction analysis. One such clone which
contained both pUB307 and plasmid pCJW58 was selected and
used for further experiments.

Construction of Streptomyces cinnamonensis (pCJW58)
and production of monensins

A single colony of E. coli ET12567 housing both
pUB307 and pCJW58 was toothpicked into 3 ml of TY liquid
medium, containing apramycin and kanamycin each at 25
micrograms per ml, and grown overnight at 37°C. This
culture was used to inoculate 25 ml of TY medium,
supplemented with the same antibiotics at the same
concentrations, and growth was continued until the
absorbance at 600 nm (1 cm pathlength) was between 0.3 and
0.6. The cells were centrifuged (room temperature, 7
minutes, 2000 x g), resuspended in TY liquid medium (10
ml) containing no added antibiotics, re-centrifuged as
before, then resuspended in 2 ml of TSB medium and placed
on ice. Meanwhile, 0.5 ml of TSB medium was added to 100
microlitres containing approximately 10^8 spores of S.
cinnamonensis. After a brief heat shock, at 50°C for 10
minutes, the suspension was briefly cooled, mixed with
0.5 ml of donor E. coli cells, and plated on solid A
medium, which has composition as follows:

A medium

Sigma wheat starch        5 g
Corn steep powder         1.25 g
Yeast extract             1.5 g
CaCO3                     - 173q
FeSO4                     6 mg
DIFCO agar                10 g
H2O                       to 500 ml

pH adjusted to pH 7 with KOH. MgCl2 was additionally
added to a final concentration of 10 mM.

The plates were allowed to dry overnight at room
temperature, and were then allowed to incubate a further
18 hours at 30°C. After this time each 25 ml plate was
overlaid with a solution of apramycin (final concentration
50 micrograms per ml) and nalidixic acid (final
concentration 20 micrograms per ml), and the plates were
allowed to incubate for four days at 30°C. At this time
individual colonies were toothpicked onto solid A medium
and allowed to grow. Four representative colonies from
the A medium plate were grown up in liquid modified YEME
medium, which has composition as follows:

Modified YEME medium

Sucrose                   100 g
DIFCO Yeast extract       3 g
Bacto peptone             5 g
Oxoid Malt extract        3 g
Glucose                   10 g
H2O                       to 1 L

pH adjusted to pH 7.2 with NaOH.

These cultures were used to provide a 2% vol/vol
inoculum for 30 ml of modified YEME which was grown for 7
days, and then transferred to SM16 medium, which has
composition as follows:

SM16 medium

3-[N-Morpholino]-propane sulfonic acid
  (MOPS) buffer                               20.9 g
L-proline                                     10.0 g
Glucose                                       20 g
NaCl                                          0.5 g
K2HPO4                                        2.1 g
Ethylenediaminetetraacetic acid, sodium salt  0.25 g
MgSO4.7H2O                                    0.49 g
CaCl2.2H2O                                    0.029 g
Trace elements solution (Hopwood, D. A. et
  al. (1985) Genetic Manipulation of
  Streptomyces - a Laboratory Manual,
  at p. 235)                                  2 ml
0.5 M CoCl2 solution                          2 microlitres
H2O                                           to 1 L

pH adjusted to pH 7 with NaOH.

After growth for a further 7 days, mycelium was
collected by centrifugation at 2000 x g for 30 minutes,
and the supernatant was extracted three times with 300 ml
of ethyl acetate. The combined extracts were concentrated
by evaporation under reduced pressure to an oil, which was
mixed with 1 ml of methanol. Samples were applied to an
LCQ liquid chromatograph fitted with a mass spectrometer
detector unit. The column used was a C18 reversed phase
column, equilibrated with a mixture of 80% 20 mM ammonium
acetate/20% acetonitrile, and the column was eluted with a
gradient of increasing acetonitrile, reaching 100%
acetonitrile over 24 minutes. Monensins A and B emerged
from the column with retention times of 8.2 minutes and
9.2 minutes respectively. The relative amounts of monensin
produced by three independent clones (A-C) containing an
additional copy of the monR gene were compared to a
control fermentation of the wild type S. cinnamonensis
strain, with the results shown in the Table below:
Table showing increased monensin production in strains
bearing an additional copy of the monR gene

Strain     monensin A            monensin B
           concentration         concentration
           (arbitrary units)     (arbitrary units)
Control    188                   861
A          430                   1800
B          450                   1300
C          249                   1300
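
To make the size of the effect easier to read, the sketch below expresses each clone's production as a fold change over the control. The values are those in the table above, with "1 800" and "1 300" read as 1800 and 1300; the names are illustrative, and because the units are arbitrary only the ratios are meaningful.

```python
# (monensin A, monensin B) in arbitrary units, as read from the table.
PRODUCTION = {
    "Control": (188, 861),
    "A": (430, 1800),
    "B": (450, 1300),
    "C": (249, 1300),
}


def fold_changes(data: dict, control: str = "Control") -> dict:
    """Return (monensin A, monensin B) ratios relative to the control,
    rounded to two decimal places."""
    ctrl_a, ctrl_b = data[control]
    return {
        strain: (round(a / ctrl_a, 2), round(b / ctrl_b, 2))
        for strain, (a, b) in data.items()
        if strain != control
    }


for strain, (fold_a, fold_b) in fold_changes(PRODUCTION).items():
    print(f"{strain}: monensin A x{fold_a}, monensin B x{fold_b}")
```

On these figures every clone carrying the extra monR copy exceeds the control for both monensins, by between roughly 1.3-fold and 2.4-fold.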

Example 6

Construction of S. cinnamonensis M12-AT5

A region lying immediately 5' of the DNA encoding the
acyltransferase (AT12) domain of module 12 of the monensin
polyketide synthase in the monensin biosynthetic gene
cluster was amplified with the following primers:
5'-GGTGGCCACGGAAACACCAACACCGGACCCGCGCC-3' and
5'-CTCTCGGAGGCCCGGCGCAACGGCCACAA-3', using cosmid MO-CN11
as a template. The PCR product was ligated into
SmaI-digested and phosphatase-treated plasmid pUC18 and the
ligation mixture was used to transform E. coli DH10B
cells. Transformant colonies were analysed for the
presence of plasmid and the identity of the plasmid
inserts was verified by sequencing. A plasmid whose
insert contained a fragment upstream of the AT12-encoding
sequence, from about 82.3 kb to 83.2 kb of the mon cluster,
was designated pM081. Similarly, a region lying immediately
3' of the DNA encoding the acyltransferase (AT12) domain
of module 12 of the monensin polyketide synthase in the
monensin biosynthetic gene cluster was amplified with the
following primers:
5'-GGCCTAGGGCTGCCTCGGGTGGTGGATCTGCCGA-3' and
5'-TGGTCGGGCGCGGTGCGTGCGATACGT-3', using cosmid
MO-CN11 as a template. The PCR product was ligated into
SmaI-treated and dephosphorylated pUC18 and the ligation
mixture was used to transform E. coli DH10B cells.
Transformant colonies were analysed for the presence of
plasmid and the identity of the plasmid inserts was
verified by sequencing. A plasmid whose insert contained
a fragment downstream of the AT12-encoding sequence, from
80.5 kb to 81.4 kb of the mon cluster, was designated pM082.

The DNA encoding the AT of module 5 was amplified and
MscI and AvrII restriction enzyme recognition sites were
introduced at the ends by PCR using the following primers:
5'-CCTGGCCAGGGCGGCCAGTGGGTGGGCATG-3' and
5'-GGCCTAGGGGTCGGCCGGGAACCAGCGCCGCCAGT-3' and the cosmid
MO-CN33 as a template. The PCR product was ligated into
SmaI-treated and dephosphorylated pUC18 and the ligation
mixture was used to transform E. coli DH10B cells.
Transformant colonies were analysed for the presence of
plasmid and the identity of the plasmid inserts was
verified by sequencing. A plasmid whose insert DNA, with
sequence from about 44.2 kb to 45.2 kb of the mon cluster,
encoded the AT5 domain was designated pM083.

pM081 was digested with MscI and HindIII and ligated
to the 0.9 kb MscI-HindIII fragment of pM082. A clone
containing both fragments was designated pM084. Plasmid
pM084 was cleaved with AvrII and HindIII, treated with
phosphatase, and ligated together with the 1.0 kb
AvrII-HindIII fragment of pM083 to produce pM085, which
contains the DNA encoding the AT5 domain flanked by DNA
from either side of the DNA encoding the AT12 domain of the
monensin PKS. The thiostrepton resistance gene tsr, derived
from plasmid pIJ702 (Katz, E. et al., J. Gen. Microbiol.
1983), was cloned into the HindIII site of pM085. The
resulting plasmid pM086 was analysed by its restriction
pattern and confirmed to contain all the desired
elements.

Plasmid pM086 was used to transform S. cinnamonensis
protoplasts as described by Hopwood, D. A. (1985). Stable
thiostrepton-resistant transformants were isolated and
checked for the desired integration of pM085 into the
AT12 flanking regions by Southern blot hybridisation. One
such integrant, S. cinnamonensis MO-08, containing pM085
integrated upstream of the AT12, was passed through 4
cycles of sporulation on a non-selective nutrient
medium. Spores obtained after the fourth cycle were
replica-plated onto media with and without thiostrepton.
DNA of clones that had lost thiostrepton resistance was
analysed by Southern blot hybridisation. Clones in which
the DNA encoding the AT12 domain had been replaced by the
DNA encoding the AT5 domain were designated S.
cinnamonensis M12-AT5. At this time individual colonies
were toothpicked onto solid A medium and allowed to grow.
Four representative colonies from the A medium plate were
grown up in liquid modified YEME medium, which has
composition as follows:

Modified YEME medium

Sucrose                   100 g
DIFCO Yeast extract       3 g
Bacto peptone             5 g
Oxoid Malt extract        3 g
Glucose                   10 g
H2O                       to 1 L

pH adjusted to pH 7.2 with NaOH.

These cultures were used to provide a 2% vol/vol
inoculum for 30 ml of modified YEME which was grown for 7
days, and then transferred to SM16 medium, which has
composition as follows:

SM16 medium

3-[N-Morpholino]-propane sulfonic acid
  (MOPS) buffer                               20.9 g
L-proline                                     10.0 g
Glucose                                       20 g
NaCl                                          0.5 g
K2HPO4                                        2.1 g
Ethylenediaminetetraacetic acid, sodium salt  0.25 g
MgSO4.7H2O                                    0.49 g
CaCl2.2H2O                                    0.029 g
Trace elements solution (Hopwood, D. A. et
  al. (1985) Genetic Manipulation of
  Streptomyces - a Laboratory Manual,
  at p. 235)                                  2 ml
0.5 M CoCl2 solution                          2 microlitres
H2O                                           to 1 L

pH adjusted to pH 7 with NaOH.

After growth for a further 7 days, mycelium was
collected by centrifugation at 2000 x g for 30 minutes,
and the supernatant was extracted three times with 300 ml
of ethyl acetate. To confirm the presence of the C-2-ethyl
substituents of both monensin A and B, the combined
extracts were concentrated by evaporation under reduced
pressure to an oil, which was mixed with 1 ml of methanol.
Samples were applied to an LCQ liquid chromatograph fitted
with a mass spectrometer detector unit. The column used
was a C18 reversed phase column, equilibrated with a
mixture of 80% 20 mM ammonium acetate/20% acetonitrile, and
the column was eluted with a gradient of increasing
acetonitrile, reaching 100% acetonitrile over 24 minutes.
Mass ions 14 mass units above those expected for both
monensin A and B confirmed production of the respective
C-2-ethyl substituents.
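
For orientation, the solvent composition at any point in the elution described above can be estimated as follows. This is only a sketch: it assumes that the gradient rises linearly from 20% to 100% acetonitrile over the 24 minutes, which the text does not state, and the function name and parameters are hypothetical.

```python
def percent_acetonitrile(t_min: float,
                         start: float = 20.0,
                         end: float = 100.0,
                         duration: float = 24.0) -> float:
    """Acetonitrile percentage at time t_min, assuming a linear gradient
    from `start` to `end` over `duration` minutes (an assumption)."""
    if t_min <= 0:
        return start
    if t_min >= duration:
        return end
    return start + (end - start) * t_min / duration


for t in (0, 12, 24):
    print(t, percent_acetonitrile(t))
```

Under this assumption the mobile phase passes 60% acetonitrile at the midpoint of the run; a different gradient shape would shift these values.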

Example 7

Construction of pSGK005 and its use in the
production of C-13 propyl-erythromycin

Plasmid pSGK005 is a pCJR24-based plasmid containing
a PKS gene comprising a loading module plus the first and
second extension modules and the chain-terminating
thioesterase of the PKS responsible for the production of
erythromycin (DEBS). The loading module comprises the KS
and ethylmalonyl-CoA-specific AT from module 5 of the
monensin PKS linked to the DEBS loading ACP domain. In
addition, the active site cysteine of this module 5 KS has
been mutated to glutamine to convert an extender di-domain
to a loading di-domain. Plasmid pSGK005 was constructed
as follows.

A 2769 bp DNA segment of the monensin cluster of S.
cinnamonensis extending from nucleotide 42438 to 45207 was
amplified by PCR using the following oligonucleotide
primers: 5'-GTGACGTCATATGTCGAGTGCTGAAGAGTCG-3' and
5'-GGGGTCGCCTAGGAACCAGCGCCGCCAGTCGA-3'.
The design of these primers introduced Nde I and Avr
II sites at the ends of the amplified fragment. Monensin
cosmid 05 was used as a template for the reaction. The
resulting 2769 bp fragment was digested with Nde I and Xho
I and a 656 bp fragment (Fragment A) was purified by
preparative gel electrophoresis.

A second PCR reaction was carried out with the same
template to amplify DNA from nucleotide 43098 to 45207. The
primers used were
5'-CGGCCTCGAGGGCCCGTCGGTCAGTGTCGACACGGCGCAGTCCTCCTCGC-3'
and 5'-GGGGTCGCCTAGGAACCAGCGCCGCCAGTCGA-3'.
The design of the upstream oligonucleotide primer
incorporated a change of the codon specifying the KS
active site cysteine (nucleotides 43135-43137, TGC) to
glutamine (CAG). The resulting 2109 bp DNA fragment
(Fragment B) was digested with Xho I and Avr II and
purified by preparative gel electrophoresis.
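
The coordinates quoted for the two PCR products and for the mutated codon can be checked against each other with a few lines of arithmetic. The sketch below assumes that the 5' end of the upstream primer corresponds to nucleotide 43098, which the text implies but does not state; the names are illustrative.

```python
FRAGMENT_FULL = (42438, 45207)   # first PCR product
FRAGMENT_B = (43098, 45207)      # second PCR product
CODON_START = 43135              # first nucleotide of the KS active-site codon

UPSTREAM_PRIMER = "CGGCCTCGAGGGCCCGTCGGTCAGTGTCGACACGGCGCAGTCCTCCTCGC"
XHO_I = "CTCGAG"


def quoted_length(span: tuple[int, int]) -> int:
    """Fragment length as quoted in the text (end minus start)."""
    return span[1] - span[0]


# Offset of the codon within the primer, assuming the primer's first
# base aligns with nucleotide 43098 (an assumption).
offset = CODON_START - FRAGMENT_B[0]
codon_in_primer = UPSTREAM_PRIMER[offset:offset + 3]

print(quoted_length(FRAGMENT_FULL), quoted_length(FRAGMENT_B))
print(codon_in_primer)                 # codon carried by the primer
print(UPSTREAM_PRIMER.find(XHO_I))     # 0-based position of the Xho I site
```

Under that alignment the primer carries CAG (glutamine) at the position where the native sequence has TGC (cysteine), and its Xho I site lies near the 5' end, which is consistent with the Xho I/Avr II digestion used to prepare Fragment B.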

Plasmid pCJW80 is derived from pCJR24 and DEBS1-TE, in
which Msc I and Avr II sites have been introduced to flank
the AT of the DEBS loading module. This plasmid was
digested with Nde I and Avr II and the larger fragment
(Fragment C) was purified by preparative gel
electrophoresis.

The three fragments (Fragments A, B, C) were ligated
together using T4 DNA ligase and the ligation mixture was
used to transform electrocompetent E. coli DH10B cells.
Individual clones were checked for the presence of the
desired plasmid pSGK005. The identity of pSGK005 was
confirmed by restriction pattern and sequence analysis.

Plasmid pSGK005 was used to transform S. erythraea
NRRL2338 using a routine protoplast transformation
technique. Thiostrepton-resistant colonies were selected
on R2T20 media containing thiostrepton. Integration of
pSGK005 into the S. erythraea NRRL2338 chromosome was
confirmed by Southern blot hybridisation of genomic DNA
from the transformants with DIG-labelled DNA
containing the actII-orf4 promoter. The culture S.
erythraea NRRL2338 (pSGK005) was inoculated into 5 ml tap
water medium in a 30 ml flask. After three days
incubation at 29°C this flask was used to inoculate 30 ml
of Ery-P medium in a 300 ml flask. The broth was incubated
at 29°C at 200 rpm for 6 days. After this time the whole
broth was adjusted to pH 8.5 with NaOH, and then extracted
twice with an equal volume of ethyl acetate. The ethyl
acetate extract was evaporated to dryness at 45°C under a
nitrogen stream using a Zymark Turbovap LV evaporator. The
product identities were confirmed by LC/MS. A peak was
observed with an m/z value of 734 (M+H)+, required for
erythromycin A. A second peak was observed with an m/z
value of 748 (M+H)+, required for 13-propyl erythromycin A.
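
The 14-unit difference between the two peaks is what would be expected if the C-13 ethyl group of erythromycin A is extended by one methylene (CH2) unit to give a propyl group. A minimal sketch of that arithmetic with nominal (integer) atomic masses follows; the names are illustrative.

```python
NOMINAL_MASS = {"C": 12, "H": 1}   # nominal atomic masses


def nominal_mass(formula: dict[str, int]) -> int:
    """Nominal mass of a fragment given as element -> atom count."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


CH2 = nominal_mass({"C": 1, "H": 2})
ERYTHROMYCIN_A_MH = 734            # (M+H)+ reported above

print(CH2)
print(ERYTHROMYCIN_A_MH + CH2)     # expected (M+H)+ for the propyl analogue
```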



TABLE I

gene     function                                start    end

gdhA     glutamate dehydrogenase (partial)       1038     1
dapA     dihydrodipicolinate synthase            2140     1220
orf3     putative transcriptional activator      2211     3152
orf4     hypothetical protein                             ooou
orf5     hypothetical protein                    4307     3684
orf6     hypothetical protein                    4570     4753
orf7     hypothetical protein                    5058     5612
acpX     acyl carrier protein                    6010     5693
ksX      ketoacyl synthase                       8531     6045
monCI    probable epoxihydrolase/cyclase         9542     8643
monE     methyltransferase                       10426    9596
monT     monensin resistance gene (ABC-          10656    12191
monRI    probable repressor                      12205    12780
monAl    thioesterase                            13829    13023
monAI    polyketide synthase loading &           14121    23198
           KS-L                                  14172    15486
           AT-L malonate specific                15777    16880
           ACP-L                                 17019    17276
           KS1                                   17358    18626
           AT1 methylmalonate specific           18960    19976
           DH1 (potential)                       20019    20519
           KR1 (inactive)                        21636    22241
           ACP1                                  22536    22793
monAl    polyketide synthase module 2            23205    29921
           KS2                                   23307    24569
           AT2 methylmalonate specific           24891    25913
           DH2                                   25953    26369
           ER2                                   27600    28463
           KR2                                   28485    29042
           ACP2                                  29313    29570
monAI    polyketide synthase modules 3 & 4       29974    42372
           KS3                                   30076    31347
           AT3 malonate specific                 31798    32838
           DH3                                   32884    33465
           KR3                                   34692    35181
           ACP3                                  35553    35811
           KS4                                   35899    37170
           AT4 methylmalonate specific           37489    38511
           DH4                                   38557    38982
           ER4                                   40123    40986
           KR4                                   41005    41562
           ACP4                                  41848    42105
monAl    polyketide synthase modules 5 & 6       42448    54564
           KS5                                   42628    43890
           AT5 ethylmalonate specific            44221    45243
           DH5                                   45289    45744
           KR5                                   46785    47337
           ACP5                                  47593    47850
           KS6                                   47947    49218
           AT6 malonate specific                 49579    50601
           DH6                                   50644    51075
           ER6                                   52222    53102
           KR6                                   53101    53661
           ACP6                                  54052    54306
monA     polyketide synthase modules 7 & 8       54614    66934
           KS7                                   54716    55978
           AT7 methylmalonate specific           56300    57319
           DH7                                   57358    57802
           KR7                                   59048    59608
           ACP7                                  59867    60124
           KS8                                   60185    61453
           AT8 malonate specific                 61808    62839
           DH8                                   62882    63316
           ER8                                   64577    65437
           KR8                                   65456    66016
           ACP8                                  66404    66661
monA     polyketide synthase module 9            66952    72054
           KS9                                   67075    68340
           AT9 malonate specific                 68698    69729
           KR9 (potential)                       70735    71262
           ACP9                                  71536    71783
monH     probable regulator                      72051    74993
monCI    FAD containing epoxidase                76541    75051
monBI    double bond isomerase                   76960    76538
monBI    double bond isomerase                   77450    77016
monA     polyketide synthase modules 11 &        88708    77447
           KS11                                  88612    87344
           AT11 methylmalonate specific          87022    85993
           KR11                                  85111    84562
           ACP11                                 84292    84035
           KS12                                  83962    82694
           AT12 methylmalonate specific          82354    81335
           DH12 (potential) delta                81286    80855
           ER12 (potential)                      79618    78914
           KR12                                  78895    78337
           ACP12                                 78070    77812
monA     polyketide synthase module 10           93741    88816
           KS10                                  93636    92368
           AT10 methylmalonate specific          92040    91021
           KR10                                  90132    89584
           ACP10                                 89322    89068
monD     P450 oxygenase                          94081    95273
monRI    probable activator                      96141    95338
monA     thioesterase                            96941    96138
orf29    cell wall biosynthesis capK             97580    98953
lipB     lipase B                                99983    98991
orf31    ion pump                                101433   100507
orf32    membrane structural protein             102581   101490
amtA     glycine amidinotransferase              102924   103450

TABLE II

GdhA, glutamate dehydrogenase (partial coding sequence) Length: 346 
amino acids 

1 LTTRPDTKTA LSQKTALiSQti LTE I EHRNPA QPEFHQAARE VLETLAPVIA 
51 ARPEYAEAGL IERLCEPERQ IVFRVPWQDD HGRVRVNRGF RVEFNSALGP 
101 YKGGLRFHPS VNLGVIKFLG FEQIFKNALT GL,GIGGGKGG SDFDPRGRSD 
151 AEVMRFCQSF MTELYRHIGEJKTDVPAGDIG VGGRE IGYLF GQ YRRITNRW 
201 EAGVLTGKGR NWGGSLIRPE ATGYGNVLFA AAMLRERGET LEGRTAWSG 
251 SGNVAIYTIQ KLAALGANAV TCSDSSGYW DE KG I D LDLL KQVKEVERAR 



301 VDTYAQRRGA SARFVPGRRV WEVPADIALP SATQNELDAD DATALI


DapA, dihydrodipicolinate synthase Length: 307 amino acids

1 MTLASSLEPT TEPLFNGLYV PLVTPFTDDL RliAPEALARL ADEALSAGAS
51 GLVAIiGTTAE AATLTAEERE TVIRVCSAAC RAHGAPLIVG VGTNDTATAI
101 TALRELAARG DVAAALVPAP PYIRPGEAGT IiAHFAAIiAEH GGLPLWYDI
151 PYRTGQTLGA GTITALGRLP EWGIKHATG SIDPTTMELL DSPLPGFAVTi
201 GGDDIVLSPL VAAGAHGGIV ASANLRTADY AEMIALWRRG SAAPARALGA
251 DLARLSAALF TEPNPTVIKG VLHAQNRIPS PAVRMPL.LAA SADSVRRAAP
301 LAASRK*










ORF3, putative transcriptional activator protein Length: 314 amino acids

1 MLDVRRLHLL RELDRRGTIA AVAEALTFTA SAVSQQLGVL EREAGVPLLE
51 RSGRRWLTP AGRSLVAHAD AVIiNRLEQAV AELAGARDGI GGPLRIGTFP
101 SGGHTIVPGA LAELASRHPA LEPMVREIDS ARVSDGLRAG ELDVALVHDY
151 DFVPATPDTT VDEVPLLEEP MYLVTHAADT ATDSGSGSTL AALLGPCAEV
201 PWITARDGTT GHAMAVRACQ AAGFQPRIRH QVNDFRTVLA LVAAGQGAGF
251 VPRMAAEPSP AGWLTKLPL FRRSKVAFRA GGGAHPAIAA FVAAATTAVE
301 RMAGSRGPAG GSE*

ORF4, hypothetical protein Length: 139 amino acids 

1 MADDAYliFLL PDRHPRLGAA IuAAVGALECT ETPAVHAWLQ AHEASVSSEQ 
51 VRILPADAET LIPKDAERLP VPLSEEEALK VEQECAPQTV TDMESELLAF 
101 RETTQDWQAL VHRALTAGIP AQRIARLTGL DPEEIGRL* 

ORF5, hypothetical protein Length: 208 amino acids 

1 LAVAACAAW LPIDAWRIS AADVGVLVFF AYLLPYLAIT MTVFVSVAPE 
51 QVRSWARREA RGTFLQRYVL GTAPGPGGSL FIAAAALWA VLWLPGHLST 
101 TFSALPRTLV ALALWAAWI CWVAFAVTF QADNLVENER AL.EFPGERSP 
151 AWADYVYFAL AAMTT FGTTD VDVTSRDMRR TVAANTVIAF VFNTVTVAIL 
201 VSALGGR* 

ORF6, hypothetical protein Length: 63 amino acids 

1 MTVMDKL KQM LKGHEDKAGQ GIDKAGDFVD GKTQGKYSGQ VDTAQDKLRD 
51 QFGSDQQEPP QR* 

ORF7, hypothetical protein Length: 185 amino acids 

1 MGTAQSQEQA AAPGACAAFV RFVLCGGGVG LASSFAWAL ASWVPWALAN 

51 ALVAWSTW ATELHARFTF GAGGRATWRQ HAQSAG S AAA AYAVTCVAMF 

101 VXiQQLVAAPG AVLEQWYLS ASALAGVARF WLRLWFAR NRSLPAAAAV 

151 RTARPVRRVP APVPATVAHA ASRPAGPAAL CPAA* 

AcpX, acyl carrier protein (ACP) Length: 106 amino acids 

1 MTSTDHTSGQ DATELEKQLA AATPEEREKL L.TDT I RTQ AG TLLJsTTTLSDD 

51 SNFLENGLNS LTALELTKTL MTLTGMEIAM VAIVENPTPA QLAHHLGQEL 
101 AHTTA* 

KsX, ketoacyl-ACP synthase Length: 829 amino acids



1 VANEEKLVEY LKWTTAELHQ AQQQLRELKA AQHEPIAWS MACRLPGKTR
51 TPDDLWDLVS EGRDAVTGFP DDRAWELPEE RPYAELGGFL DDAAGFDAGF
101 FDISDTEAVA TEPLQRLMLH LAWETVERGH IAPHTLRSTL TGVYVGATGH
151 DYATRLETAP DELLPYLGGG TSGSLVSGRI AYALGLEGPA ISVDTACSSS
201 LVALHLACQA LRRGECGLAL AGGGTVMSTP HTFHAFAHQK SLAQDGRCKP
251 FAAAADGMGL GEGVGLVLLE RLGDARKNGH PVLAVIRGSA VNQDGAGYGL
301 AAPNGPSQQH VIRAALADAG LTPDQIDAVE AHGTGTPIGD AIEVQALLAT
351 YGADRSPDRP LWLGSVKSNT GHTQGAAGAA ALIKMVQAFR HGTLPPTLHV
401 DRPTPLAAWK KGAVRLLTEA VDWPRREEPR RVGISAFATS GTNAHLILEE
451 PPVDEAPVPD AARDQTSPVA PELPVAWSLS ARTPEALRAQ AKAIjVTHIiAA
501 TDPAPSPAEV AYSLAATRSP LEHRAVLTGT DHTELLAAAR ALAAGEDHPD
551 LVRSTPGAGP KKIAWHFDGR PADGVTTGAA PGAKPGATFG ATFGAAFGGA
601 EFHSAFPLFA SAFDEARALL DTHLPTPLPT PHSELARFAV HTALARLLLE
651 TGVRPHTLTG DGVGHIAAAY AAGILTLDDA CRLAAAHAAA AQAAEGEQPA
701 PPDAYEPVLK QLTFQRATLT LTSTAPADTP IASADYWHHH LTSPAPTAPP
751 TPETHTLLHL GALSPEGTQT SAVSALLTAL ARLHTTGGTV DWTPLVRRTP
801 HPRTIDLPTY SFQATRYWLH DHTAHAAV*






MonCII, probable epoxyhydrolase/cyclase Length: 300 amino acids 


1 VKKLiRIPVSQ TVSLNVRYRP ADGPGAPGRP FLLLHGMLSN ARMWDEVAAR
51 LAAAGHPAYA VDHRGHGESD TPPDGYDNAT WTDLVAAVT ALDLSGALVA
101 GHSWGAHLAL RLAAEHPDLV AGLALIDGGW YEFDGPVMRA FWERTADWR
151 RAQQGTTSAA DMRAYLRATH PDWSPTSIEA RLADYRVGPD GLLIPRLTST
201 QVMSIVAGLQ REAPADWYPK VTVPVRLLPL IPAIPQLSDQ VRAWVAAAEA
251 ALEQVSVRWY PGSDHDLHAG APDEIAAJDLL LLARSCEAMP GGKAGVRPA*

MonE, S-adenosylmethionine-dependent methyltransferase Length: 277
amino acids 

1 VNKTVAPEPS DIGHYYDHKV FDLMTQLGDG NLHYGYWFDG GEQQATFDEA 
51 MVQMTDEMIR RLDPA PGDRV LtDIGCGNGTP AMQLARARDV EWGISVSAR 
101 QVERGNRRAR EAGLADRVRF EQVDAMNLPF DDGSFDHCWA LESMLHMPDK 
151 QQVXiTEAHRV VKPGARiMPIA DMVYLNPDPS RPRTATVSDT TIYAAL.TDIG 
201 DYPDI FRAAG WTVLELTDIT RETAKTYDGY VEWIRAHRDE YVDIIGVEGY 
251 EIiFLHNQAAL GKMPELGYIF ATAQRP* 

MonT, putative monensin resistance gene (ABC-transporter) Length: 512 
amino acids 



1 MSADLGARRW WAVGALiVLAS MWGFDVTIL SLALPAMADD LGANNVELQW
51 FVTSYTLVFA AGMIPAGMLG DRFGRKKVLL TALVIFGIAS LACAYATSSG
101 TFIGARAVLG LGAALIMPTT LSLLPVMFSD EERPKAIGAV AGAAMLAYPL
151 GPILGGYLLN HFWWGSVFLI NVPWILAFL AVSAWLPESK AKEAKPFDIG
201 GLVFSSVGLA ALTYGVIQGG EKGWTDVTTL VPCIGGLLAL VLiFVMWEKRV
251 ADPLVDLSLF RSARFTSGTM LGTVINFTMF GVLFTMPQYY QAVLGTDAMG
301 SGFRLLPMVG GLLVGVTVAN KVAKALGPKT AVGIGFALLA AALFYGATTD
351 VSSGTGLAAA WTAAYGLGLG IALPTAMDAA LGALSEDSAG VGSGVNQSIR
401 TLGGSFGAAI LGSILNSGYR GKLDLDGVPE QAHGAVKDSV FGGLAVARAI
451 KSNGLADSVR SAYVHAXtDW IiVVSGGLGLIj GWLAWWLP RHVGOSTAKT
501 AESEHEAADA V*








MonRII, probable repressor protein Length: 192 amino acids


1 VPGLRERKKA RTKAAIQREA VRLFREQGYT ATTIEQIAEA AEVAPSTVFR
51 YFATKQDLVF SHDYDLPFAM MVQAQSPDLT PIQAERQAIR SMLQDISEQE

101 LALQRERFVL ILSEPELWGA SLGNIGQTMQ XMSEQVAKRA GRDPRDPAVR 
151 AYTGAVFGVM LQVSMDWAND PDMDFATTLD EALHYLEDLR P* 

MonAIX, thioesterase Length: 269 amino acids 

1 MDRGTAARAP QIGDEFGAAT GNGVWLRRYH AAAEAPVRLV CFPFAGGSAS 
51 YYFGLSGLLA PGVEVLAVQY PGRQDRHAEP CLASVAELAD J^WPHLPCDG 
101 KPFALFGHSL GAIVAFEVAR RLRGPAGPGL PVHLFVSGGJL ARPYRPAGRS 



151 GAFGDADILA HLRAMGGTDE_ RFFRSPELQE LVLPALRADY RAVATYEAPG
201 PGRLDCPITA LIGDADERTS PEQAATWRER TGAAFDLRVIj PGGHFYLDGC
251 QEQVAAWTE ALTAGPGV*








MonAI, polyketide synthase multi-enzyme MONS1, housing loading module
and extension module 1 Length: 3026 amino acids


1 MAASASASPS GPSAGPDPIA WGMACRLPG APDPDAFWRL LSEGRSAVST
51 APPERRRADS GLHGPGGYLD RIDGFDADFF HISPREAVAM DPQQRLI/LEL
101 SWEALEDAGI RPPTLARSRT GVFVGAFWDD YTDVXjNLiRAP GAVTRHTMTG
151 VHRSIIiANRI SYAYHLAGPS LTVDTAQSSS LVAVHLACES IRSGDSDIAF
201 AGGVNLiICSP RTTELAAARF GGLSAAGRCH TFDARADGFV RGEGGGLWIj
251 KPLAAARRDG DTVYCVIRGS AVNSDGTTDG ITLPSGQAQQ DWRLACRRA
301 RITPDQVQYV ELHGTGTPVG DPIEAAALGA ALGQDAARAV PLAVGSAKTN
351 VGHLEAAAGI VGIjIiKTALSI HHRRLAPSLN FTTPNPAIPL ADLGLTVQQD
401 LADWPRPEQP LIAGVSSFGM GGTNGHWVA AAPDSVAVPE PVGVPERVEV
451 PEPWVSEPV WPTPWPVSA HSASALRAQA GRLRTHLAAH RPTPDAARVG
501 HALATTRAPL AHRAVLLGGD TAELLGSLDA LAEGAETASI VRGEAYTEGR
551 TAFLFSGQGA QRLGMGRELY AVFPVFADAL DEAFAALDVH LDRPLREIVL
601 GETDSGGNVS GENVIGEGAD HQALLDQTAY TQPALFAIET SLYRLAASFG
651 LKPDYVTjGHS VGEIAAAHVA GVLSLPDASA LVATRGRLMO AVRAPGAMAA
701 WQATADEAAE QLAGHERHVT VAAVNGPDSV WSGDRATVD ELTAAWRGRG
751 RKAHHLKVSH AFHSPHMDPI LDELRAVAAG LTFHEPVIPV VSNVTGELVT
801 ATATGSGAGQ ADPEYWARHA REPVRFLSGV RGLCERGVTT FVELGPDAPL
851 SAMARDCFPA PADRSRPRPA AIATCRRGRD EVATFLRSLA QAYVRGADVD
901 FTRAYGATAT RRFPLPTYPF QRERHWPAAA GVGQQPETPE LPESSESSEQ
951 AGHEREEGAR AWGGPEGRLA GLSVNDQERV LLGLVTKHVA WLGDASGTV



1001 QAARTFKQLG FDSMAAAELS ERLGTETGLP LPATLTFDYP TPLAVAAHLR
1051 AELTGTPAPA GSAPATGALG AGDLGTDEDP VAIVAMSCRY PGGAGTPEDL
1101 WRLVADGADA IGDFPTDRGW DLARLFHPDP DRSGTSCTRQ GGFLYDAADF
1151 DAEFFDISPR EALAVDPQQR LLLECAWEAF ERAGLDPRAL KGSPTGVFVG
1201 MTGQDYGPRL HEPSQATDGY LLTGSTPSVA SGRLSFSFGL EGPAliTVDTA
1251 CSSSLVTLHL AAQALRRGEC DLALAGGATV liATPGMFTEF SRQRGIjAPDG
1301 RCKPFAAGAD GTGWAEGVGL VLLERLSEAR RKGHAVLiAVI RGSAINQDGA
1351 SNGLTAPNGP SQQRVIRAAL AAARLTADEV DWEAHGTGT TLGDPIEAQA
1401 LLATYGQGRS AERPLWLGSV KSNIGHTQAA AGVAGVIKMV MAMRHDLLiPA
1451 TLHVDEPSGH VDWSTGAVRL LTEPWWPRG ERPRRAAVSS FGISGTNAHL
1501 VLEEAGQDEY VAGAADDAGP VDGAVLPWW SGRTGAALRE QARRLRELVT
1551 GGSADVSVSG VGRSLVTTRA VFEHRAVWG RDRDTLIGGL EAIiAAGDASP
1601 DWCGVAGDV GPGPVLVFPG QGSQWVGMGA QLLGESAVFA ARIDACEQAL
1651 SPYVDWSLTE VLRGDGRELS RVDWQPVLW AVMVSLAAVW ADHGVTPAAV
1701 VGHSQGEIAA WVAGAXtTLE DGAKIVALRS RALRQLSGGG AMASLGVGQE
1751 QAAELVEGHP GVGIAAVNGP SSTVISGPPE QVAAWADAE ARELRGRVID
1801 VDYASHSPQV DAITDELTHT LSGVRPTTAP VAFYSAVTGT RIDTAGLDTD
1851 YWVTNLRRPV RFADAVTALL. ADGHRVFIEA SSHPVLTLGL QETFEEAGVD
1901 AVTVPTLRRE DGGRARLARS LAQAFGAGCA VRWENWFPAT GTSTVELPTY
1951 AFQRRRYWLE APTGTQDAAG LGLAAAGHPL LGAATEIADG DIRLLTGRIS
2001 RHSHPWLAQH TLFGAAWPA SVLAEWALRA ADEAGCPRVD DLTLRTPLVL
2051 PETAGVQVQI WGPADARDG HRDFHVYARP DGKDASEGEG IAEGEGASEG
2101 EGASGGTDAP WTCHADGRLV AEPTGTASED SPDTVWPPPG *AEPVDL»GDFY
2151 ERAAATGVGY GPVFTGLRAXi WRRDGELFAE AVLPQEAPET AGFGMHPALL
2201 DAALHPALLG ERPAEEDKVW LPFTLTGVTL WATGATSVRV RLTPLDDDPD
2251 ASADGRAWRV GVSDPTGAEV LTCEALVAVA AGRRELRAAG ERVSDL.YAVE
2301 WVPVPGPGPV GEGADFSGWA GLGECGERWE CVGRVERWYE DLDALGAAVE
2351 GGASVPSWL ATAAAAPGGA GDGAADAIjSA VRWTGALLDQ WLADARFADA
2401 RLWITSGAV ATGDDFLPDP AAAAVRGLVE QAQVRHPGRI LLVDTEAGAG
2451 LGVGAGVDDA LLEQAVAMAXi GADEPQLALR AGRVLAPRLT APQDAAVTEA
2501 ARPLDPDGTV LITGPAGAPV ADLAEHLVRT GQCRHLLKLP GDGELEEMAE
2551 ELRGLGATVD LSTADPADPT ALiAEWAAVE GDHPLTGVIH ATGVVDAFDP
2601 GDSASDLMID SASDSFAEAW SSRAGVTAAla HTATAHLPLD LFAVLSPAGA
2651 DLGIARSAAA AGADAFSAAL ALRRHTTVTT DTTAPPRTTA PPRTTASPRT
2701 TALSSSRTTG VAIiAYGPPTA PRPGIKGTAP GRIPVLLDAA RAHGGGSPLL
2751 GARLAARALA AESAAEGVAG LPAPLRALAV AAAAAGAPTR RTAADRKPPA
2801 DWPARLAPLS APEQLRlfLID AVRTHAAAVL GRTDPEALRG DATFKQLGLD
2851 SliTAVELRNR LVEDTGLRLP TALVFRYPTP AAIAAHLRER LTSPSETTAT
2901 QRSGGQTPAA GQASSALAPG GSAAGPPAAD TVLSDLTRME NTLSVLAAQL
2951 PHTETGEITT RLEALLTRWK TTNATANDSG DGNGGDDDAA ERLKAASADQ
3001 IFDFIDNELG VGHGTSRVTP TPKAG*

MonAII, polyketide synthase multi-enzyme MONS2, housing extension
module 2 Length: 2239 amino acids 



1 MASEEQLVEY LRRVTTELHD TRRRLVQEED RRQEPVAL.VG MACRFPGGVA
51 SPEDLWDLVA AGKDAIEDFP TDRGWDLEAL YDPDPAAYGT SYVRHGGFVD
101 DAGSFDADFF GISPREALAM DPQQRLMLjET SWELFERAGI EPVSLKGSRT
151 GVYAGVSSED YMSQLPRIPE GFEGHATTGS LTSVISGRVA YNYGLEGPAV
201 TVDTACSASL VAIHLASQAL RQRECDLALA GGVLVliSSPL MFTEFCRQRG
251 LiAPDGRCKPF AAAADGTGFS EGIGLLLLER LSDARRNGHK VLAVIRGSAV
301 NQDGASNGLT APNDAAQEQV IRAALDNARL TPSEVDAVEA HGTGTKLGDP
351 IEAGALLATY GQHRARPLLL GSLKSNIGHT HATAGVAGVI KTVMAIRNGL
401 LPATLHVEEL SPHVDWDAGA VEWTEPTPW PETGHPRRAG VSAFGISGTN
451 AHLILEEAPP EEDVPAPVW ESGGWPWW SGRTPEALRE QARRLGEFVA
501 GDTDALPNEV GWSLATTRSV FEHRAVWGR DRDALTAGLG AIjAAGEASAG
551 WAGVAGDVG PGPVLVFPGQ GAQWVGMGAQ LLDESAVFAA RIAECERALS
601 AHVDWSLSAV LRGDGSELSR VEWQPVLWA VMVSLAAVWA DYGVTPAAVI
651 GHSQGEMAAA CVAGALSLED AARIVAVRSD ALRQLQGHGD MASLSTGAEQ
701 AAELIGDRPG WVAAVNGPS STVISGPPEH VAAWADAEA RGLRARVIDV
751 GYASHGPQID QLHDLI/TERL ADIRPTNTDV AFYSTVTAER IjTDTTALDTD
801 YWVTNLRQPV RFADTIEALL ADGYRLFIEA SAHPVLGLGM EETIEQADMP
851 ATWPTLRRD HGDTTQLTRA AAHAFTAGAD VDWRRWFPAD PAPRTIDLPT
901 YAFQRRRYWL ADTVKRDSGW DPAGSGHAQL PTAVALADGG WLNGRVSAE
951 RGGWLGGHW AGTVLVPGAA LVEWVLRAGD EAGCPSLEEL TLQAPLVLPE

1001 SGGLQVQVW GAADEQGGRR DVHVYSRSEQ DASAVWQCHA VGELGRASVA 
1051 RPVRQAGQWP PAGAEPVEVG GFYEGVAAAG YEYGPAFRGL RAMWRHGDDL 
1101 LAEVEL.PEEA GSPAGFGIHP ALLDAALHPL LAQRSRDGAG AGAHGGQVLL
1151 PFSWSGVSLW ASEATTVRVR LTGLGGGDDE TVSLTVTDPA GGPWDVAEL
1201 RLRSTSARQV RGSAGPGADG LYELRWTPLP EPLPVPAPAN GRDVAADLSG
1251 CAVLGELVAE PGPGIDLEGC PCYPGVGALA DNASPPSMIL APVHSDTTGG
1301 DGLAI/TERVL RVIQDFLiAAP SLEQKQTRLA FVTRGAADTG STTGGSAAPA
1351 EAVDPAVAAV WGLVRSAQSE NPGRFVLLDT DAPLDQASVA ^LVDAVRSAV
1401 EADEPQVALR GGRLLVPRWA RAGEPVELAG PAGARAWRLV GGDSGTLEAV
1451 VAEACDDIVL RPLAPGQVRV AVHTAGVNFR DVLIALGMYP DPDALPGTEA
1501 AGWTEVGPG VTRLSVGDRV MGMMDGAFGP WAVADARMLA PVPPGWGTRQ
1551 AAAAPAAFLiT AWYGLVELAG LjKAGERVXiIH AATGGVGMAA VQIARHVGAE
1601 VFATASPGKH AVLEEMGIDA AHRASSRDIiA FEDAFRQATD GRGVDWLNS
1651 LTGELLDASL RLLGDGGRFV EMGKSDPRDP ELVALEHPGV SYEAFDLVAD
1701 AGPERLGLML DRLGELFAGG SLVPLPVTAW PLGRAREALR HMSQARHTGK
1751 LVLiDVPAPLD PDGTVLVTGG TGTIGAAVAE HLARTGESKH LLIVSRSGPA
1801 AHGAEELVSR IAEFGAEATF VAADVSEPDA VAALIEGIDP AHPLTGWHA
1851 AGVIiDMALIG SQTTESLTRV WAAKAAAAQQ LHEATRESRL GLFVMFSSFA
1901 [illegible] LAALRRAEGL AGLSVAWGLW EATSGLTGTIj
1951 SAADRARIDR YGIRPTSAAR GCALLlAAARA HGRPDLLAMD LDARVPAASD:
2001 APVPAVLRTL AAAGAPATAR PTAAAAADGA TDWSGRLAGL TEEARLELLT
2051 ELVCTHAAGV LGHADAGAVQ VDAPFKELGF DSLTAVELRN RIAAATGLKL
2101 PAALVFDYPQ ARVLiAAHIiAE RLVPEGAGAM GGVSGAEGVR DAYGAGGPGG
2151 DMTAQVLLEV ARVEHTLSAA VPHGLDRAAV AARLEALLAR CTATTAATGA
2201 AGAAVEGDGD SDGDGAVDQI* ETATAEQVLD FIDNELGV*





MonAIII, polyketide synthase multi-enzyme MONS3, housing extension 
modules 3 and 4 Length: 4133 amino acids 
11 SEPTEMBER 2000



1 MVSEEKLVDY LKRVSADLHA TRQRLREAEE RGQEPVAWE AACRYPGGIR
51 TPEDLWDLVA AGGNALGAFP DNRGWDLRRL FHPDPDHPGT TYAREGGFLH
101 DADLFDPEFF GISPREAAVL DPQQRLLLEC AWEALERAGI DPRSLQGSRT
151 GVYAGAALtPG FGTPHIDPAA EGHLVTGSAP SVLSGRLAYT FGLEGPAVTI
201 DTACSSSLVA VHLAAHALRQ RECDLALAGG VTVMTTPYVF TEFSRQRGLA
251 ADGRCKPFAA AADGTAFSEG AGLLVLERLS DARRAGHRVL AVIRGSAVNQ
301 DGASNGLTAP NGPAQQRVIR AALAGARLSP AEVDAVEAHG TGTRLGDPIE
351 ADALLATYGQ ERHGGRPLWL GSVKSNIGHT QGAAGAAGLI KMVQALRHET
401 LPATLYADEP TPHADWESGA VRLLSAPVAW PRGEHGEHTR RAGISSFGIS
451 GTNAHLILEE APAADAEGAG GDGDGDGGGV RPWRVGATG PREEQGQGQG
501 QEQHQQQRQQ RQRSSMMPTP HLPWLLSARS PAALRAQADA LANHVAHADH
551 SIADIGGTLL RRTLFEHRAV VLGTDRDERA AALAALAAGR AHPALTRAAG
601 PARNGGTAFL FTGQGSQRPG MGRQLYDTFD VFAESLDETC ARLDPLLEQP
651 LKPVLFAPAD TAQAAVXjHGT GMTQAALFAL EVALYRQVTS FGIAPSHLTG
701 HSVGEIAAAH VAGVFSLADA CTLVAARGRL MQALPAGGAM LtAVQAAEDDV
751 LPLLiAGQEER LSLAAVNGPT AVWSGEAAA VGEVEKALRG RGLKTKRLNV
801 SHAFHSPLIE PML.DDFREVA RGLTFHAPTL PWSNLTGRL ADAELMADAE
851 YWVRHVRRPV RFHDGLRALS EQGWRYLEL GPDPVLATMV QDGLPAPAEG
901 EEPEPWAAA LRSKHDEGRT LLGAVAALHT DGQPADLTALt FPADAGQVPL
951 PTYRFQRRRY WRVAPDAAAP ARAAGLQETG HPLLPAVIRQ ADGGILLAGR
1001 LiSLRTHPWLiA DHTIAGGVPL PATAFVELAL LAGRHAACDT IDDLTLETPL,



1051 LLDDTGTGVG AAVGAGADAL VDAIEVQLAL GAPDGSGRRA LTVHSRPADD 
1101 AADDGDAADA ADAAGRGGPG GSGDLGDPGD PGDLGDGGGS RGWRRHATG I 
1151 LSAGPAAEPA APDAAPWPPA DATALDVDAL YARLDAQGYS YGPAFRAVHA
1201 AWRHGDDLYA DVRLADEQRA EADAFALHPA LLDAALHAVD ELYRGSEGRG
1251 QEQGQGGQEP EQGRGDADAP VRLPFSFSDI RHHATGATRL WVRLSPQGDD
1301 RLRLSLTDGE GGQVATVDAL QLRLIPADRW RAARPTTAAP LYHLDWHELP
1351 LPEPAETDPA AHSWAVLGAH DAGLAPAAHY PDLAALKAAV EAGEPVPDIV
1401 FAPFPAQGTE TDVPAQVRAH ARHALELLRD WliTTEAFAAA "kLVVLTTGAV
1451 TARPEDGPAD LATAPVWGLV RAAQAEQPDH WL.VDIDKDI [illegible]
1501 ATDAGTASRH ALPAALAAAA AQAETQLALR AGTVLVPRIoA VVFPKiLllrA
1551 LHATAPESTT DTVDSTGIAG AAESGGTVLI TGGTGGLGQA VARHLiAAAHG
1601 ARHLLLVSRR GDAAEGVAEL RAD1ADDGVD VRVAACDITD RDALAGLLAD
1651 IPAAHPLTAV VHTAGVIDDS LITAMTPERL DAVLAPKADA AWHLHELTRD
1701 KDLSAFVLFS SGASVLGNGG QANYAAANTF LNTLAEHRRA AGLAATSVAW
1751 GliWESASGGM AARLGDADRA RIHRTGVTGL TDEQALALFD AALTAEHPTV
1801 LATRFDRAVL RGQAAARTLQ PALRGLVRTP RPTASAGAIG STAATGSATD
1851 ENAPSSWAAR LARLSAADRD RALNEL.IREQ IATVLAHPSP DTIELGRAFQ
1901 ELGFDSLiTAL ELRNRLSTAT GIRLPATLVF DHPSPTALVR HLHSHLPDEA
1951 QHTSPTAPGA SAEGTAATAT GIDDDPIAIV GMACRYPGGV TSPEQLWQLV
2001 ATGTDAIGPF PEDRGWDTAG LFDPDPDQVG HSYTREGGFL YDAARFDAGF
2051 FGISPREAAA TDPQQRLLLE TAWQAFEHAG IDPAALiRGTP CGVITGIMYD
2101 DYGSRFLiARK PDGFEGRIMT GSTPSVASGR VAYTFGLEGP AITVDTACSS
2151 SLVAMHLAAQ ALRQGECELA LAGGVTVMAT PNTFVEFSRQ RGLAPDGRCK
2201 PFAAAADGTG WGEGAGLWL ERLSDARRKG HRVLALLRGS AVNQDGASNG
2251 MTAPNGPSQE RVIRTALAGA GRGPEDIDW EAHGTGTTLG DPIEAQALIiA
2301 TYGQGRPEDR PLWLGSVKSN IGHTQAAAGV AGVIKMVMAL RHEQLPTTLH



2351 ADEPTPHVQW DGGGVRLLTE PVPWSRGERT RRAGVSSFGI SGTNAHLILE
2401 EPPEEDLPEP VAAEPGGWP WWSGRTPDA LREQARRLGE FWGAGDVSA
2451 AEVGWSLATT RSVFEHRAW AGRDRDDLVA GMQALAAGET PTDWSGAAA
2501 SSGAGPVLVF PGQGSQWVGM GAQLLDESPV FAARIAECEQ ALSAYVDWSL
2551 SDVLRGDGSE LSRVEWQPV LWAVMVSLAA VWADYGVTPA AWGHSQGEM
2601 AAACVAGALS LEDAARIVAV RSDALRQLQG HGDMASLGTG AEQAAELIGD
2651 RPGVWAAVN GPSSTVISGP PEHVAAWAE AEARGLRARV IDVGYASHGP
2701 QIDQLHDLLT EGLADIRPAN TDVAFYSTVT AERLTDTTALi DTDYWVTNLR
2751 QPVRFADTIE ALLADGYRLF IEASAHPVLG LGMEETIEQA DIPATVVPTL
2801 RRDHGDTTQIj TRAAAHAFTA GADVDWRRWF PADPTPRTVD LPTYAFQHQH
2851 YWLtEEPSGLT GDAADLGMVA AGHPLLGACV ELAESDSYLF TGRIjSRRAPS
2901 WLAEHWAGT VLVPGAALVE WVLRAGDEAG CPTIEELTLQ APLVLPESGG
2951 LQVQVWGAT DEQSGRRDVH VYSRSEQDAS AVWVCHAVGV VSSEMPEAAA
3001 ELSGQWPPAG AEAVDVEDFY ARAAEAGYAY GPAFQGLRAL WRHGTELFAE
3051 WLPEQAGGH DGFGIHPALL DAALHPLMLL DRPADGQMWL PFAWSGVSLN
3101 ADRATHVRVR LiSPRGEAAER DLRWIADAT GAPVLTVDAL TLRAADPGRL
3151 GAAARGGVDG LYTVDWTPLP LPQPLPLPRT DAGGSADWVI LSDNSSAALA
3201 DAVSSATAAG GGAPWALLAP VGGGSADDGL PWRRTLSLV QEFLAAPELT
3251 ESRLVIVTRG AVATDADGDV AASAAAVWGL IRSAQSENPG RFVLLDVEEE
3301 HLHPDGGELP YAALRHAVEE LDEPQLALiRS GKFLVPRMTP AAAPEELVPP
3351 VGTSGWRLGT SGTATLENLS VIDAPEAFAP LEPGQVRISV RAAGMNFRDV
3401 LIALGMYPDK GTFAGSEGAG HVTEVGPGVT HLSVGDRVMG LFEGAFAPLA
3451 VADARMWPI PEGWSFQEAA AVPWFLTAW YGLVDLGRLR AGESLLIHAG
3501 TGGVGMAATQ IARHLGAEVF ATASPAKHGV LDGMGIDAAH RASSRDLDFE



3551 ETLRAATGGR GMDWLNSLA GEFTDASLRL LAEGGRMVDM GKTDKRDPDR
3601 VAAEHAGAWY RAFDLVPHAG PDRIGEMLAE LGELFASGAL APLPVQTWPL
3651 GRAREAFRFM SQAKHTGKLV LEIPPALDPD GTVLITGGTG VLAAAVAEHL
3701 VREWGVRHLL LAGRRGSEAP GSSELAEELT ELGAEVTFAA ADVSDPDAVA
3751 ELVGKTDPAH PLTGVIHAAG VLDDAWTAQ TPESLARVWA AKATAAHLLH
3801 EATREARLGL FLVFSSAAAT LGSPGQANYA AAWAYCDALV T^QRRAEGLAG
3851 LSIGWGLWQT ASGMTGHLGE TDLARMKRTG FTPLTTEGGL ALLDAARAHG
3901 RPHWAVDLD ARAVAAQPAP SRPALLRAIiA AGATPGARTA RRTAAAGSVA
3951 PAGGIiADRLA GLPHPERRRL LLDLVRGNVA GVLGHSDHDA VRPDTSFKEL
4001 GFDSLTAVEL RNRLAAATGL KLPAAL.VFDY PESATLVDHL LERLSPDGAP
4051 PPVKDAADPV LNDLGRIESS LDALALDADA RSRVTRRLNT LLSKLNGAAT
4101 AGSPADVTDL DALDALtDDVS DDEMFEFIDR EL*





MonAIV, polyketide synthase multi-enzyme MONS4, housing extension 
modules 5 and 6 Length: 4039 amino acids 

1 MSSAEESSPD VSGTGVSGTG ESATGTSSTE AKLRQYLKRV TVDLGQARRR 



51 LREVEERAQE PIAIVSMACR FPGDTRTPEA LWDLVAEGGD AIDDFPTNRG
101 WDLESLYHPD PDHPGTSYVR RGGFLYDAPA FDASFFGISP REAIiAMDPQQ
151 RVLMETAWQL LERAGIDPAS LKLSATGVYI GAGVLGFGGA QPDKTVEGHL
201 LTGSALSVLS GRISFTLGLE GPSVSVDTAC SSSLVSMHlxA AQALRQGECD
251 LAIxAGGVTVM STPGAFTEFS RQGALSPDGR SKAFAASADG TGFSEGAGLL
301 LLERLSDARR NGHKVLAVIR GSAVNQDGAS NGLTAPNGPS QERVIRAA1*A
351 NAGLGAAEVD AVEAHGTGTK LGDPIEAGAL LATYGRDRDE DRPLWLGSVK
401 SNIGHPQGAA GVAGVIKMVM ALQRELLPAT L*YVDEPTPHV DWSSGSVRLL,
451 TEPVPWTRGE RPRRAGVSAF GMSGTNAHVI LEEAPPEEAA AAETPAEGTG
501 AWPWWSGR GEEALRAQAA QLAEHVRDDD QRPASPLEVG WSLATTRSVF


551 ENRAWVGDD RDALLDGLRS LAAGEASPDV VSGAVGPTGP GPVMVFPGOG
601 GQWVGMGARL LDESPVFAAR IAECEQALSA YVDWSLtTDVL RGDGSELARI
651 DWQPVLWAV MVALAAVWAD QGIEPAAWG HSQGEIAAAC WGAISLDEA
701 ARIVAVRSVL LRQLSGRGGM ASLGMGQEQA ADLIDGHPGV WAAVNGPSS
751 TVISGPPEGI AAWADAQER GLRARAVASD VAGHGPQLDA ILDQLTEGLA
801 GIRPAATDVA FYSTVTAGHL TDTTELDTAY WVRNVRRTVR FADTIDALLA
851 DGYRLFIEVS PHPVLNLALE GLIERAAVPA TWPTLRRDH GDTTQLARAA
901 AHAFAAGADV DWRRWFPADP APRTVDLPTY AFQRQDFWPA PAGGRSGDPA
951 GLGIaAASGHP liLGASVGLAS GDVHLLiSGRV SRQSAAWLDD HWAGQALVP
1001 GAAQVEWVLR AGDDAGCSAL EELTLQTPLV LPDTGGLRIQ WVEAADAHG
1051 RRDVRLFSRP DDDDAFASTH PWTCHATGVL APAPTDGTNG TRDAADTLDG
1101 AWPPADAEPV PADDLYAQAD RTGYGYGPAF RGVRALWRHG KDVTjAEVTLP
1151 KEAGDPDGFG IHPALLDAVL QPAALLLPPT DAEQVWLPFA WNDVAL.HAVR
1201 ATTVRVRLTP LGERIDQGLR ITVADAVGAP VLTVRDLRSR PTDTGRLiAAA
1251 ATRDRHGLFD LEWIAPENAA ENAAGPARDA SEGWVTLGED AASLADLiLtAS
1301 VEAGAPAPQL VAAPVEPDRT DDGLALATHV LDLVQTWLAS PLHDSRLVLV
1351 TRGAVTDADV DVAAAAVWGL VRSAQSEHPG RFTLIDLGPD DTLAAAMQAA
1401 HLEEPQLAVH GGEIRVPRLV RATTDPTAPN GTPEADRTAD PSEGLHRNGT
1451 VT,1TGGTGVL GRLVAEHLVT EWGVRHLLLA SRRGDQAPGS AELRARLSEL
1501 GASVEIAPAD VGDAEAVAAL IASVDPAHPL TGVIHAAGVL DDAVITAQTP
1551 ESLARVWATK ATAARHLHEA TRETPLDFFV VFSSAAASLG SPGQANYAAA
1601 NAYCDALVQH RRAQGLAGLS IAWGLWQATS GMTGQLSETD LARMKRTGFA
1651 ALTDEGGLAL LiDAARAHDRA YWAADLDPR AVTDGLiSPLL RALTAPATRR
1701 RVASEGLADG AliATRLAGLD ADGRLRLLTD WREYVAAVXj GHGSAARVGV



1751 DIAFKDLGFD SLTAVELRNR LSAACDVRLP 'ATHFDHPTP QALATHLVDR
1801 LiAGSTSATTT VNATAPAAAH VAAGADVDAD TDDPVAIVAM TCRFPGGVAS
1851 PDDLWDL.LDA RKDAMGAFPT DRGWDLERLF HPDPDHPGTS YTDQGGFLPD
1901 AGDFDAAFFG INPREAIiAMD PQQRLLLEAS WEVLiERAGID PTTLKGTPTG
1951 TYVGLMYHDY AKSFPTADAQ LEGYSYLAST GSMVSGRVAY TLGLEGPAVT
2001 VDTACSSSLV SIHIiATQALR HGECDLALAG GVTVMADPDM *FAGFSRQRGL
2051 SPDGRCKAYA AAADGVGFSE GVGVLIjIiERL SDARRHGRRV LGWRGSAVN
2101 QDGASNGLTA PNGPSQERVI RQALASGGLS SVDVDWEGH GTGTTLGDPI
2151 EAQALLATYG QGRPEDRPLW LGSVKSNIGH TQAAAGVAGV IKMVMAMRHG
2201 WPASLHVDV PSPHVEWDSG AVRLAVESVP WPQVEGRPRR AGVSSFGASG
2251 TNAHVIVESV PDGLEEDSVS VGGEALETET DGRLVPWWS ARSPQALRDQ
2301 ALRLRDFASD ASFRAPLADV GWSLLKTRAL HEHRAVWGA ERAELIAALE
2351 ALATGEPHAA LVGPACSQAR VGGDDWWLF SGQGSQLVGM GAGLYERFPV
2401 FAAAFDEVCG LLEGPLGVEA GGLREWFRG PRERLDHTVW AQAGLFALQV
2451 GIxARLWESVG VRPDWLGHS IGEIAAAHVA GVFDLADACR WGARARLMG
2501 GLPEGGAMCA VQATPAELAA DVDGSAVSVA AVNTPDSTVI SGPSDEVDRI
2551 AGVWRERGRK TKALSVSHAF HSALMEPMLA EFTEAIRGVK FRQPSIPIiMS
2601 NTVSGERAGEE ITDPEYWARH VRNAVLFQPA IAQVADSAGV FVELGPAPVL
2651 TTAAQHTLDE SDSQESVLVA SLAGERPEES AFVEAMARLH TAGVAVDWSV
2701 LFAGDRVPGL VELPTYAFQR ERFWLSGRSG GGDAATLGLV AAGHPLLGAA
2751 VEFADRGGCL LTGRLSRSGV SWLADHWAG AVLVPGAALV EWALRAGDEV
2801 GCVTVEELML QAPLWPEAS GLRVQWVEE AGEDGRRGVQ IYSRPDADAV
2851 GGDDSWICHA TGVLSPESAR LDTELGGVWP PAGAEPLDVD GFYAQAGEAG
2901 YGYGPAFRGL RAVWRHGQDL LAEWLPEAA GAHDGYGIHP ALLDATLHPL
2951 LAARFMDGSE DDQLYVPFGW AGVSDRAVGA TTVRVRLRPV GESVDQGLSV
3001 TVTDATGGPV LSVDSLQTRP VKPSQLAAAQ QPDVRGLFTV EWTPLPQTDA
3051 DGEADWWXiS DGVGRLADW SAAGGEAPWA WAPVDASVG DGREGLDGRL
3101 WERVLSLVQ EFLALPELAE SRLLWTRGA VATGVDGDGD VDASAAAVWG
3151 LVRSAQSENP GRFILLDVDG DGDDQGPDLN GRHLPHATLR HAAEELDEPQ
3201 LALREGTLYV PRLTQARQSA ELWPPGEPA WRLRMVHDGS LDALAAVACP
3251 EALEPLAPGQ VRIAVHAAGI NFRDVLVALG MVPAYGAMGG EGAGWTEVG
3301 PEVTHVSVGD RVMGVFEGAF GPWIAEARM VTPVPQGWDM REAAGIPAAF
3351 LTAWYGLVEL AGLKAGERVL VHAATGGVGM AAVQIARHVG AEVFATASPG
3401 KHAVLEEMGI DAAHRASSRD LAFEGTFREA TGGRGMDWL NSLAGEFIDA
3451 SLRLLGDGGR FLEMGKTDVR AAEEVAAEHA DVSYTAYDLV GDAGPDRISN
3501 MLDKLVELFA SERLKPLPVR SWPLDKAQEA FRFMSQAKHT GKLVLEIPPA
3551 LDPEGTVLVT GGTGALiGQW AEHLVREWGV RHLLLASRRG PEAPGSDELA
3601 SKLTGLGAEV TIVAADVSDP ASWELVGKT DPSHPLTGW HAAGVLEDGV
3651 VTAQTPEGLA RVWAAKAAAA ANLHEATREM RLGLFWFSS AAATLGSPGQ
3701 ANYAAANAYC DALMQHRRAV GQVGLSVGWG
3751 TVGKASALSD GTNGSAPQDT TGTAPQGMTG GLTDTDVARM ARIGVKGMSN
3801 AHGLALFDAA HRHGRPHLVG FNLDLRTLAT HPLHTRPALL RGLATPTAGG
3851 ASRPTATAGG QPADLAGRLA ALSPSDRHHT LVRLIREQAA TVLGHHPDSL
3901 TTGSTFKELG FDSLTAVELR NRLSAATGLR LPAGLVFDHP DADILAEHLG
3951 AQLAPDGDTP AGAEATDPVL RDLAKLENAL SSTLVEHLDA DAVTARLEAL
4001 LSNWKAASAA PGSGSTKEQL QVATTDQVU3 FIDKELGV*





MonAV, polyketide synthase multi-enzyme MONS5, housing extension
modules 7 and 8 Length: 4107 amino acids 
SUBSTITUTE SHEET (RULE 26)



1 MASEEELVDY LKRVAAELHD TRQRLREVED RRQEPVAWG MACRFPGGIE
51 TPEGLWELVA AGDDAIEPFP TDRGWDLEGI YHPDPDHPGT CYVREGGFLA
101 APDRFDSDFF GFSPREALAS SPQLRLLLET SWEALERAGI NPASLKGSPT
151 GVYVGAATTG NQTQGDPGGK ATEGYAGTAP SVLiSGRLSFT LGLEGPAVTV
201 ETACSSSLVA MHLAANALRQ GECDLALiAGG VTVMSTPEVF TGFSRQRGLA
251 PDGRCKPFAA AADGTGWGEG AGLILLERLS DARRKGHKVL AVIRGSAINQ
301 DGASNGFTAP NGPSQRRVIR QALSSAHLST SEIDWEAHG TGTRLGDPIE
351 AEALIATYGK ereddrplwiT GSVKSNIGHT QAAAGVAGVI KMVMALQREL
401 Ijpatlnvdep TPHVQWEGGG VRLLTEPVPW SRGERPRRAG ISSFGISGTN
451 AHWLEEAPP EEDVPGPVAA EPEGWPWW SARTEEALSE QARRLGEFVA
501 DTDPSTADVG WSLTTSRAIL EHRAVWGRD RDALTAGLAA LAAGEESADV
551 VAGVAGDVGP GPVLVFPGQG SQWVGMGAQL LDESPVFAAR IAECEQALSA
601 YVDWSLSAVL RGDGSEL'SRV EWQPVLWAV MVSIiAAVWAD YGVTPAAVIG
651 HSQGEMAAAC VAGALiSLEDA ARVVAVRSDA LRQLMGQGDM ASLGASSEQA
701 AELIGDRPGV CIAAVNGPSS TVISGPPEHV AAWADAEER GLiRARVIDVG
751 YASHGPQIDQ LHDLLTDRLA DIRPATTDVA FYSTVTAERL TDTTALDTDY
801 WVTNLRQPVR FADTIDALLA DGYRLFIEAS AHPVLGLGME ETIEQADXPA
851 TWPTLRRDH GDTTQLTRAA AKAFTAGATV JJWKKWrPADP TPRTIDLPTY
901 AFQRRSYWLP VDGVGDVRSA GLRRVEHSLL PAALGLADGA LVLTGRLAAS
951 GGGGGWLADH AVAGTTLVPG AALVEWALRA ADEAGCPSLE ELTLQAPLVL
1001 PGSGGLQVQV WGPADGQGG RREVRVFSRV DSDDEAAGQD EGWSCHATGV
1051 LSPEPGAVPD GLSGQWPPTG AEPLEISDLY EQAASAGYEY GPSFRGLRSV
1101 WRHGHNLLAE VELPEQAGAH DDFGIHPVLL DAALHPALLL DQNAPGEEQE
1151 PAQPALRLPF VWNGVSLWAT GAATVRVRLA PHGGGETDDS AGLRVTVADA



1201 TGAPVLSVDS LALRPADPEL LRTAGRAGSG TNGLFTVEWT ALPPADVADH
1251 AAGDGWAVLG QDVPDWAGAD MPRHPDMASL SAALDEGTQA PAAVFVETTA
1301 TSHATPNTAA DVTLDASGRA VAERTLHLLR DWLAEPRLAE TRLVLITHHA
1351 VTTPADDDVN AAPLDVPAAA LWGLIRSAQA EHPDRFVLLD TDAKANTDPG
1401 PDTSTDHSTA SGTYRTVIAR ALATGEPQLA VRAGELLAPR LARAATPTPE
1451 TPTPETQPDT GSGSEAGAGS GSGPGATLiDP DGTVLIAGGT GMMGGLVAEH
1501 LVRAWSVRHL LLVSRQGPDA PDARDLADRL VGJjGATVRIV AADLTDGRAT
1551 ADLVASVDPA HPLTGVIHAA GVLDDAWTA QTSDQLARVW AAKASVAANL
1601 DAATSELPLG LFLMFSSAAG VLGNAGQAGY AAANAFVDAL VGRRRATGLP
1651 GLSIAWGLWA RGSAMTRHLD DADLARLRAG GVKPLLDEQG LALLDAARAT
1701 AAHTSLWAA GIDVRGLNRD DVPAILRDLA GRTRRRAAAD STVDQAALER
1751 RLTGLDEAER RAWTDWRE CVAAVLGHRS AADVRTEANF KDLGFDSLTA
1801 VQLRNRLSAA SGLRLPATLA FDHPTPQALA AYLGTRLSGR TATPVAPVAP
1851 SAAATDEPVA IVAMACKYPG GATSPEGLWD LVAEGVDAVG AFPTGRGWDL
1901 ERLFHPDPDH PGTSYADEGA FLPDAGDFDA AFFGINPREA LAMDPQQRLIi
1951 LEASWEVLER AGIDPTTLKG TPTGTYVGVM YHDYAAGLAQ DAQLEGYSML
2001 AGSGSWSGR VAYTLGLEGP AVTVDTACSS SLVSIHLAAQ AXiRQGECTLA
2051 liAGGVTVMAT PEVFTGFSRQ RGLAPDGRCK PFAAAADGTG WGEGVGVTjLLi
2101 ERLSDARRHG RRVLGWRGS AVNQDGASNG LTAPNGPSQE RVIRQAliASG
2151 GLSSVDVDW EGHGTGTTLiG DPIEAQALLA TYGQGRPVDR PLWLGSVKSN
2201 IGHTQAAAGV AGVIKMVMAM RHGWPASLH VDVPSPHVEW DSGAVRLAVE
2251 SVPWPEVEGR PRRAGVSSFG ASGTNAHVIV ESVPDGLGED SVSVSGEAPE
2301 TETDGRLVPW WSARSPQAL. RDQALRLRDA VAADSTVSVQ DVGWSLLKTR
2351 ALFEQRAVW GRERAELL.SG LAVLAAGEEH PAVTRSREDG VAASGAWWL



2401 FSGQGSQLVG MGAGLYERFP VFAAAFDEVC GLLEGPLGVE AGGLREWFR
2451 GPRERLDHTM WAQAGLFALQ VGLARLWESV GVRPDWLGH SIGEIAAAHV
2501 AGVFDLADAC RWGARARLM GGLPEGGAMC AVQATPAELA ADVDDSGVSV
2551 AAVNTPDSTV ISGPSGEVDR IAGVWRERGR KTKALSVSHA FHSALMEPML
2601 AEFTEAIREV KFTRPKVSLI SNVSGLEAGE EIASPEYWAR HVRQTVLFQP
2651 GIAQVASTAG VFVELGPGPV LTTAAQHTLD DVTDRHGPEP VLVSSLAGER
2701 PEESAFVEAM ARLHTAGVAV DWSVL.FAGDR VPGLVELPTY AFQRERFWLS
2751 GRSGGGDAAT LGLVAAGHPL LGAAVEFADR GGCLLTGRLS RSGVSWLADH
2801 WAGAVLVPG AALVEWALRA GDEVGCVTVE ELMLQAPLW PEASGLRVQV
2851 WEEAGEDGR RGVQIYSRPD ADAVSGDDSW ICHATGTLTP QHTDAPNDGL
2901 AGAWPAAGAV PVDLAGFYER VADAGYAYGP GFQGLRAVWR HGQDLiLAEW
2951 liPEAAGAHDG YGIHPALLDA TLHPALLLDW PGEVQDDDGK VWLPFTWNQV
3001 SLRAAGAATV RVRLSPGEHD EAEREVQVLV ADATGTDVLS VGSVTLRPAD
3051 IRQLQAVPGH DDGLFSVDWT PLPLSRTDVS QTDADGDADW WLSDGVGSL
3101 ADWSAAGGE APWAWAPVG ASAGGGLAGF DRREGLDGRL WERVLSLVQ
3151 EFLAAPELiAE SRL.LVLTRGA VATGGDGDGD VDASAAAVWG LVRSAQSENP
3201 GRFILLDVDM DVDVDVDMDV DVDVDVDVDV DGDGNGSDliD PDLNGRRLPH
3251 ATLRHAAEEL DEPQLALRDG QLLVPRLVRA TGGGLWAPT DRAWRLDKGS
3301 AETLESVAPV AYPGVMEPLG PGQVRLGIHA AGINFRDVLV SLGMVPGQVG
3351 LiGGEGAGWT ETGPDVTHIjS VGDRVMGVLH GSFGPTAVAD TRMVAPVPQG
3401 WDMRQAAAMP VAYLTAWYGL VELAGLKAGE RVLIHAATGG VGMAAVQIAR
3451 HLiGAEVFATA SAAKHWLEE MGIDAAHRAS SRDLAFEDTF ROATDGRGMD
3501 WLMSLTGEF IDASLRLLGD GGRFLEMGKT DVRTPEEVAA EYPGVTYTVY
3551 DLVTDAGPDR IAVMMSELGE RFASGALDPL PVRSWPLDKA REAFRFMSQA
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SUBSTJ f U : fc ii-vicT (RULE 26) 



1 1 SEPTEMBER 2000 



3601 KHTGKLVLDV PAPLDPDGTV LITGGTGALG QWAEHLVRE WGVRHLLLAS
3651 RRGLDAPGSG ELADRLSDLG AEVTVAAADV SDPASWEIjV GKTDPSHPLT
3701 GWHMGVLE DGIVTAQTPE GLARVWAAKA AAAANLHEAT REMRLGLFW
3751 FSSAAATLGS PGOANYAAAN AYCDALMQRR RAAGQVGLSV GWGLWEAPDA
3801 KPGVAADAKP DVAADAKTGV AADGTPQGMT GTLSGTDVAR MARIGVKAMT
3851 SAHGLALiLDA AHRHGRPHLV AVDLDTRVLA HKPAPALPAL LRAFAGDQGG
3901 QGGGRGGGRG GGPARPAAAT TRQNVDWAAK LSVLTAEEQH RTLLDLVRTH
3951 AAAVLGHAGT DAVRADAAFQ DLGFDSLTAV ELRNRLSAST GLRLPATFIF
4001 RHPTPSAIAD ELRAQLAPAG ADPAAPLFGE LDKLETVITG HAHDESTRTR
4051 LAARLQNLLW RLDDTSARSD HAAGASDADG DAVENRDLES ASDDELFELI
4101 DRELPS*











MonAVI, polyketide synthase multi-enzyme MONS6, housing extension
module 9 Length: 1701 amino acids






1 MPGTNDMPGT EDKLiRKYLKR VTADLGQTRQ RLRDVEERQR EPIAIVAMAC
51 RYPGGVASPE QLWDLVASRG DAIEEFPADR GWDVAGLYHP DPDHPGTTYV
101 REAGFLRDAA RFDADFFGIN PREALAADPQ QRVLLEVSWE LiFERAGIDPA
151 TLKDTLTGVY AGVSSQDHMS GSRVPPEVEG YATTGTLSSV ISGRIAYTFG
201 LEGPAVTLDT ACSASLVAIH LACQALRQGD CGLAVAGGVT VLSTPTAFVE
251 FSRQRGLAPD GRCKPFAEAA DGTGFSEGVG LILLERLSDA RRNGHQVLGV
301 VRGSAVNQDG ASNGLTAPND VAQERVIRQA LTNARVTPDA VDAVEAHGTG
351 TTLGDPIEGN ALLATYGKDR PADRPLWtiGS VKSNIGHTQA AAGVAGVIKM
401 VMAMRHGELP ASliHIDRPTP HVDWEGGGVR LLTDPVPWPR ADRPRRAGVS
451 SFGISGTNAH LIVEQAPAPP DTADDAPEGA ATPGASDGLV VPWWSARSP
501 QALRDQALRL RDFAGDASRA PLTDVGWSLL RSRALiFEQRA WAGRERAEL
551 LAGLAALAAG EEHPAVTRSR EEAAVAASGD WWLFSGQGS QLVGMGAGLY

601 ERFPVFAAAF DEVCGLLEGE LGVGSGGLRE WFVJGPRERL DHTVWAQAGL 

651 FALQVGLARL WESVGVRPDV VLGHSIGEIA AAHVAGVFDL ADACRWGAR 

701 ARLMGGLPEG GAMCAVQATP AELAADVDGS SVSVAAVNTP DSTVISGPSG 

751 EVDRIAGVWR ERGRKTKALS VSHAFHSALM BPMLGEFTEA IRGVKFRQPS 

801 IPLMSNVSGE RAGEEITSPE YWARHVRQTV LFQPGVAQVA AEARAFVELG

851 PGPVKTAAAQ HTLDHITEPE GPEPWTASL HPDRPDDVAF AHAMADLHVA 

901 GISVDWSAYF PDDPAPRTVD LPTYAFQGRR FWLADIAAPE AVSSTDGEEA 

951 GFWAAVEGAD FQALCDTLHlf KDDEHRAALE TVFPALSAWR RERRERSIVD 



1001 AWRYRVDWRR VELPTPVPGA GTGPDADTGL GAWLIVAPTH GSGTWPQACA
1051 RALEEAGAPV RIVEAGPHAD RADMADLVQA WRASCADDTT QLGGVLSLLA
1101 LAEAPATSSD TTSHTSTSCG TGSLASHGLT GTLTLLHGLL DAGVEAPLWC
1151 ATRGAVSCGD ADPLVSPSQA PVWGIX5RVAA LEHPELWGGL VDLPADPESL
1201 DASALYAVLR GDGGEDQVAL RRGAVLGRRL VPDATPDVAP GSSPDVSGGA
1251 AHADATSGEW QPHGAVIiVTG GVGHLADQW RWLAASGAEH WLLDTGPAN
1301 SRGPGRNDDL AAEAAEHGTE LTVLRSLSEL TDVSVRPIRT VIHTSLPGEL
1351 APLAEVTPDA LGAAVSAAAR LSELPGIGSV ETVLFFSSVT ASLGSREHGA
1401 YAAANAYLDA LAQRAGADAA SPRTVSVGWG IWDLPDDGDV ARGAAGLSRR
1451 QGLPPLEPQL ALGALiRAAliD GGKGHTLVAD IEWERFAPLF TLARPTRLLD
1501 GIPAAQRVLD ASSESAEASE NASALRRELT ALPVRERTGA LLiDLVRKQVA
1551 AVLRYEPGQD VAPEKAFKDL GFDSIiWVEL RNRIiRAATGL RLPATLVYDY
1601 PTPRTLAAHL LDRVIiPDGGA AELPVAAHLD DLEAALTDLiP ADDPRRKGLV
1651 RRLQTLLWKQ PDAMGAAGPA DEEEQAAPED LSTASADDMF ALIDREWGTR
1701 *











MonH, probable regulatory protein Length: 981 amino acids



1 VSGVERGVGS AGPVEQGDGL AGLVERAEAL AALRGAFDGS PGTGGSLWL
51 SGAVGTGKTA LLRAWADRIG ADADALVLTA TACRAERDLP LGVLEQLVRS
101 PGLPPASAER ALAWWDEEAS ATPGKTDANG TSANGTDANG TGAGQTGAGQ
151 AGVGQTGVGG EPVLAASALR GLCEVLRDLL AERPVWAVD DAHHADAASL
201 QCKLSWRRL RSARLHVLFT EYAHQKAQNA LLSSEFLHEP ALRRIRLEPL
251 SKAGVEALLA RHLDERTAQD LTPWHGMSA GHPLLVRALA Ihdhraaggag
301 EAYGRAVLSF LYRHETPVTQ VARAIAALGA HAGPGQVGRLi LDVDAASVER
351 AVRQLTVAEV lhegrlchptT FAAAVLDGMP PEERRALHGR VADLLHEEGA
401 PATEVAAHLV AADRSDAPWA VPVFQEAAQLi ALDEDQVETG VDYLRAAHQR
451 CRGAAQRAAV VGALADAEWR LDPAKVLRHL PDPAAMAPQT DPAALAPHTD
501 PAPTAAPTAA PTPTPIPTTP PLPTHLLWHG RVEEGLDAIG TLTGPGPNPA
551 GAPPMNPADL DTPWIiWGAYL YPGHVKERLG SGALSPQRST PPAVTPELQG
601 AGTLMNDLLH GGERDATEAA ERALNRYRLG PRTIAVQTAA LAALTYRDRP
651 HRAAAWCDGL VAQADERNSP TWRALFTAWR ALLHLRQGDP AAAEQRAETA
701 LALLGSKGWG AAIGLPLiAAA VQAKAALGDV DGAAALLERP VPQAVFQTRT
751 GLHYLAARGR YHLATGCHYA AliCDFYACGT RMSSWGVDLP ALEPWRIiGAA
801 EAYLALGEGL LAJRQLVDGQL PliPTPDDGRT WGMTLRLRAA TSPAPARAEli
851 LDEAVAVLRE SGDTFELARA VADQAVAVRE GGEAERARLL ARKAELLARR
901 WGSAPAPATV PEPPERPGPA TPDAELTSAE RRVAELAAEG FTNREISRKL
951 CVTVSTVEQH LTRIYRKLDV RRLDLQAALG *




MonCI, flavin-dependent epoxidase Length: 496 amino acids


1 VTTTRPAHAV VLGASMAGTL AAHVLARHVD AVTWERDAL PEEPQHRKGV
51 PQARHAHLLVJ SNGARLIEEM LPGTTDRLLA AGARRLGFPE DLVTLTGQGW
101 QHRFPATQFA LVASRPLLDL TVRQQALGAD NITVRQRTEA VELTGSGGGS




151 GGRVTGVWR DLDSGRQEQL EADLVIDATG RGSRLKQWLA ALGVPALEED
201 WDAGVAYAT RLFKAPPGAT THFPAVNIAA DDRVREPGRF GWYPIEGGR
251 WLATLSCTRG AQLPTHEDEF IPFAENLNHP ILADLLRDAE PLTPVFGSRS
301 GANRRLYPER LEQWPDGLLV IGDSLTAFNP 1YGHGMSSAA RCATTIDREF
351 ERSVQEGTGS ARAGTRALQK AIGAAVDDPW ILAATKDIDY VNCRVSATDP
401 RLIGVDTEQR LRFAEAITAA SIRSPKASEI VTDVMSLNAP (JAELGSNRFL
451 MAMRADERLP ELTAPPFLPE ELAWGLDAA TISPTPTPTP TAAVRS

MonBII, carbon-carbon double bond isomerase Length: 141 amino acids 

1 MPDEAARKQM AVDYAERINA GDIEGVLDLF TDDIVFEDPV GRPPMVGKDD
51 LRRHLELAVS CGTHEVPDPP MTSMDDRFW TPTTVTVQRP RPMTFRIVGI
101 VELDEHGLGR RVQAFWGVTD VTMDDPAGPA DTTHPEGIRA *

MonBI, carbon-carbon double bond isomerase Length: 144 amino acids 

1 MNEFARKKRA LEHSRRINAG DLDAIIDIjYA PDAVLEDPVG LPPVTGHDAL 
51 RAHYEPLLAA HLREEAAEPV AGQDATHALI QISSVMDYLP VGPLYAERGW 
101 LKAPDAPGTA RIHRTAMLVI RMDASGLIRH LKSYWGTSDL TVLG
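
The entries in this table share one layout: a starting residue position, then blocks of ten residues, with `*` marking the end of the chain. The following Python sketch, assuming only that layout, joins such rows into one sequence and rejects a row whose stated position disagrees with the residues read so far, which is one way to notice a block that has lost a character. The name `parse_listing` and the toy rows are hypothetical illustrations and are not part of the specification.

```python
def parse_listing(rows):
    """Join rows of the form '<position> BLOCK BLOCK ...' into one sequence.

    Assumes each non-empty row starts with the 1-based position of its
    first residue. Raises ValueError when that position does not match
    the number of residues already read.
    """
    sequence = ""
    for row in rows:
        fields = row.split()
        if not fields:
            continue  # skip blank lines between rows
        start, blocks = int(fields[0]), fields[1:]
        expected = len(sequence) + 1
        if start != expected:
            raise ValueError(f"row starts at {start}, expected {expected}")
        sequence += "".join(blocks)
    # A trailing '*' marks the end of the chain and is not a residue.
    return sequence.rstrip("*")


# Toy rows (invented for illustration, not taken from the listing).
toy_rows = ["1 MKTAYIAKQR QISFVKSHFS", "21 RQLEE *"]
toy_sequence = parse_listing(toy_rows)
```

Comparing `len(toy_sequence)` with the length stated in an entry heading gives a second, independent consistency check.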

MonAVIII, polyketide synthase multi-enzyme MONS8, housing extension 
modules 11 and 12 Length: 3754 amino acids 



1 MSNEEKLliDH LKWVTAELRQ ARQRLHDKES TEPVAIVGMA CRYPGGARSA
51 EDLWELVRDG GDAVAGFPDD RGWDLESLYH PDPEHPATSY VRDGAFLYDA
101 GHFDAEFFGI SPREATAMDP QQRLLiLETAW EAIEHAGMNP HALKGSDTGV
151 FTGVSAHDYL TMSQTASDV EGYIGTGNLG SWSGRISYT VGLEGPAVTV
201 DTACSSSLVA IHLASQALRQ GECSLALAGG STVMATPGSF TEFSRQRGLA
251 PDGRCKPFAA AADGTGWGEG AGWALELLS EARRRGHKVL AVIRGSATNQ
301 DGTSNGLAAP NGPSQERVIR AALANARLSA EDIDAVEAHG TGTTLGDPIE



351 AQALIATYGQ GRPSDRPLWL GSVKSNIGHT QAAAGVAGVI KMVMAMRNGL
401 LPTSLHIDAP SPHVQWEQGS VRLLSEPVDW PAERTRRAGI SAFGISGTNA
451 HLILEEAPPE EDAPGPVAAE PGGWPWWS GRTPDALREQ ARRLGEFAAG
501 LABASVSEVG WSLATTRALF DQRAWVGRD LAQAGASLEA LAAGEASADV
551 VAGVAGDVGP GPVLVFPGQG SQWVGMGAQL LDESPVFAAR IAECEQALSA
601 HVDWSLSDVL RGDGSELSRV EWQPVLWAV MVSLAAVWAD irtSlTPAAVIG
651 HSQGEMAAAC VAGALSLEDA ARIVAVRSDA LRQLQGHGDM ASLSTGAEQA
701 AELIGDRPGV WAAVNGPSS TVISGPPEHV AAWADAEAQ GLRARVIDVR
751 YASHGPQIDQ LHDLjLTDRLA DTQPTTTDVA FYSTVTAERli DDTTALDTAY
801 WVTNLRQPVR FADTIEALIiA DGYRLFIEAS PHPVLNLGIQ ETIEQQAGAA
851 GTAVTIPTLR RDHGDTTQLT RAAAHAFTAG APVDWRRWFP ADPTPRTVDL
901 PTYAFQHKHY WVEPPAAVAA VGGGHDPVEA RVWQAIEDLD IDALAGSIiEI
951 EGQAESVGAL ESALPVLSAW RRRHREQSTV DSWRYQVTWK HLPDVPAPEL
1001 SGAWLLLVPA AHADHPAVliA TAQTLTAHGG EVRRHWDAR AMERTELAQE
1051 LRVLMDGAAF AGWNLLiAIjD EEPHPEHSAV PAGLAATTAL VQALADNGAD
1101 IAVRTLTQGA VSTSAGDALT HPVQAQVWGL GRVAALEYPR LWGGLVDLPA
1151 RIDHQTIiARL AAALiVPQDED QISIRPSGVH ARRIiAHAPAN TVGSGLGWRP
1201 DGTTLITGGT GGIGAVXiARW LARAGAPHLL LTSRRGPDAP GAQEliAAELT
1251 ELGAAVTVTA CDVGDREQVR RLIDDVPAEH PLTAVIHAAG VPNYIGLGDV
1301 SGAELDEVXjR PKALiAAHHLH ELTRELPLSA FVMFSSGAGV WGSGQQGAYG
1351 AANHFLiDAIiA EHRRAEGLPA TSIAWGPWAE AGMAADQAAL TFFSRFGLHP
1401 LSPELCVKAL QQALDAGETT LTVANFDWAQ FTSTFTAQRP SPLIiADLPEN
1451 RRASAPAAQQ EDATEASSLQ QELTEAKPAQ QRQLLLQHVR SQAAATLGHS
1501 DVDAVPATKP FQELGFDSLT AVELRJNRLtNK STGliTLPTTV VFDHPTPDAL
1551 TDVLiRAELiSG DAAASADPVR AAGASRGAAD DEPIAIVGMA CRYPGDVRSA


1601 EELWDLVAAG KDAMGAFPDD RGWDLETLYD PDPESRGTSY VREGGFLYDA
1651 GDFDAGFFGI SPREAVAMDP QQRLLLETAW EAIERAGLDR ETLKGSDAGV
1701 FTGLTIFDYL ALVGEQPTEV EGYIGTGNLG CVASGRVSYV LGLEGPAMTI
1751 DTGCSSSLVA IHQAAHALRQ GECSLALAGG ATVMATPGSF VEFSLQRGLA
1801 KDGRCKPFAA AADGTGWAEG VGLWLERLS EARRNGHNVL AVIRGSAINQ
1851 DGTSNGLTAP NGQAQQRVIR QALANARLSA EDVDAVEAHG TGTMLGDPIE
1901 ASALVATYGK ERPADRPLWL GSIKSNIGHA QASAGVAGV1 KMVMALRNEQ
1951 LPASLHIDAP TPHVDWDGSG VRLLSEPVSW PRGERPRRAG VSAFGISGTN
2001 AHLILEQAPD APEPVTAPAE DAAAPAGWP WWSARGEEA LRAQARLLAD
2051 RATADPRIiAS PLDVGWSbVK TRSVFENRAV WGKDRQTLL AGLRSLAAGE
2101 PSPDWEGAV QGASGAGPVL VFPGQGSQWV GMGAQLLDES PVFAARIAEC
2151 ERALSAHVDW SLiSAVLiRGDG SELSRVEWQ PVLWAVMVSL ASVWADYGIT
2201 PAAVIGHSQG EMAAACVAGA LSLEDAARIV AVRSDALRQL MGQGDMASLG
2251 AGSEQVAELI GDRPGVCVAA VWGPSSTVIS GPPEHVAAW ADAEARGLRA
2301 RVIDVGYASH GPQIDQLHDli LTERIiADIRP TTTDVAFYST VTAERLrDDTT
2351 TLDTDYWVTN LRQPVRFADT IEALLADGYR LFIEASPHPV LNXjGMEETIE
2401 PAnMPATWP TLRRDHGDAA QLTRAAAQAF GAGAEVDWTG WFPAVPLiPRV
2451 VDLPTYAFQR ERFWLEGRRG LAGDPAGLGL ASAGHPLLGA AVELADGGSH
2501 LLTGRISPRD QAWLAEHRVM DTVLLPGSAF VELALQAAVR AGCAELAELT
2551 LHTPLAFGDE GAGAVDVQW VGSVAEDGRR PVTVHSRPTG EGEEAW7TRH
2601 AAGWAPPGP DAGDASFGGT WPPPGATPVG EQDPYGELiAS YGYDFGPGSQ
2651 GLVSAWRLGD DLFAEVALPE AESGRADRYQ VHPVLLDATL HALILDAVTS
2701 SADTDQVLLP FSWSGLRVHA PGAEKLRVRI ARTAPDQIxAL TAVDGGGGGE
2751 PVLTLESLTV RPVAAHQIAG ARAADRDALF RLVWMEVAAR AEETGGGAPR


2801 AAVLAPVESG PMGGTSAGAL ADALSDALAA GPVWDTFGAL RDGVAAGGEA
2851 PDWLAVCAA PGAGAGAVAD ADGRGGDPAG YARLATVSLL SLLKEWVDDP
2901 AFAATRLVW TRGAVAARPG ETAGDLAGAS LWGLVRSAQA ENPGRLTLLD
2951 VDGLESSPAT LTGVLiASGEP ELALRDGRAY VPRLVRDDAS VRLVPPVGSL
3001 TWRLARCQEA GGGQQliSLVD APEAGRALEP HEVRVAVRAA APGPLTAGQV
3051 EGAGWTEVG GEVGSVAVGD RVMGLFDAVG PVAVTDAALL MPVPAGWSWA
3101 QAAGSLGAYV SAYHVLADW APRGGETbLV GEETGSVGRA VLRLALAGRW
3151 RVEAVDGAST ADDSGAERAA DVTLRHEGAL WHRAGGRPD EGQAWPPEP
3201 GRVREILAEL TELTELAEIT ESAEPGLPAE RGDSRALTPL DITVWDIRQA
3251 PAAMAAPPSA GTTVFSLPPA FDPEGTVLtVT GGTGALGSI/T ARHLVERYGA
3301 RHXjLIiSSRRG ADAPGALELA ADLSALGARV TFAACDPGDR DEAAALLAAV
3351 PSDHPLTAVF HCAGTVNDAV VQNLTAEQVE EVMRVKADAA WHLHELTRDA
3401 DLSAFVLYSS VAGLLGGPGQ GSYTAANAFL DALARHRHDG GAAATSLAWG
3451 YWELASGMSG RLTDADRARH ARAGWGLGA DEGLALLDAA WAGGLPLYAP
3501 VRLDLARMRR QAQSHPAPAL LRDLVRGGSK SGGGAVSAGA AALLKSLGAM
3551 SDPEREEALL DLVCTHIAAV LGYDAATPVN ATQGLRELGF DSLTAVELRN
3601 RLSAATGLKL PATFVFDHPN PAELAAQLRQ ELAPRAADPL ADVLAEFERI
3651 EDSLLSVSSK DGSARAELAG RLRATLARLD APQDTAGEVA VATRTRIQDA
3701 SADEIFAFID RDLGRDGASG QGNGQPTGQG NGHGNGNGNG NGNGHGQAVE
3751 GQR*











MonAVII, polyketide synthase multi-enzyme MONS7, housing extension
module 10 Length: 1642 amino acids 

1 MAHTEEKLLE YLKRVTADLR QTERRLQDVE SAGHEPVAVI GMACRLPGGV
51 RSPEEFWELV STGGDAVAPL PGNRNWDLDS LYDPDPESTG TSYVREGGFV
101 YDAGDFDPTF FGIGPTEAAA MAPQQRLALE TAWEAIERAG IDPLSLRSSD
151 TSTFIGCDGL DYALGASEVP EGTAGYFTIG NSGSVTSGRV AYTLGLEGPA
201 VTVDTACSSS LVSLHLATQA LRTQECSLALi AGGTYVMSSP APLIGFSELR
251 GLAPDGRCKP FSASSDGMGM AEGTGWLLE RLSDARRKGH KVLAVIRGSA
301 INQDGASNGL TAPNGPAQER VIRAALANAR LAPEDIDAVE AHGTGTTLGD
351 PIEAGALISA YGRERPEDRP LWVGAVKSNI GHTQIAAGVA Ttvikmvlalr
401 HDLLPAILHV DAPSPHVEWD GSGLRLLTDP VKWPRGERPR RAGVSSFGFS
451 GTNAHLILEE APPEEEDVPG* SVAEEPGGW PWWSGRTPD ALRAQARRLG
501 EFAAGPADAS AADVGWSLTT TRSVFEHRAV WGRDRDALT AGLGALAAGE
551 ASAGWAGVA GDVGPGPVLV FPGQGSQWVG MGAQLLDESP VFAARIAECE
601 RALSAYVDVJS LSAVLRGDGS ELSRVEWQP VLWAVMVSLA AVWADYGVTP
651 AAVIGHSQGE MAAACVAGAli SLEDAARIVA VRSDALRRLQ GHGDMASLST
701 GAEQAAELIG DRPGWVAAV NGPSSTVISG PPEHVAAWA DAEARGLRAR
751 VTDVGYASHG PQIDQLiHDXAj TERLADIRPA NTDVAFYSTV TAERLTDTTA
801 L1DTDYWVTNL1 RQPVRFADTI EALLADGYRL FIEASAHPVL GLGMEETIEQ
851 ADIPATWPT LRRDHGDTTQ LTRAAAHAFT AGAPVDWRRW FPADPTPRTV
901 DLPTYAFQHQ HYWLERSASA SGAVSGEQSA AEAQLWHAVE ELDLGLLAET
951 LGSEEGSEEA VRALEPALPV LKGWRRRHQD QATIDSWRYR VTWKQRSDGP



1001 APELGGDWLL FVPADKAEHP AVRATAEALS EHGAAAVRLH PVETGRAGRO

1051 ELAAVDTAGL AG I VTSTLLiALD EEPHPEHPAV PAGLAATTAL LQALGDNGTT 

1101 APLiHTVTQGA VSTGATDPLT HPLQAHVWGL GRVAALEHPR bWAGLVDLPA 

1151 RIDRHTLPRL AAALLPQDDE DQTAVRPTGI HHRRLTHAVG SIQNPVHSEA 

1201 TWRPRGTTLI TGGTGGIGAV LARWLARQGA PRLHLTSRRG PDAPGARELA
1251 AELDGLGTAV TITACDVSDP RQLSGLIDDM PAEHPLTAVI HAAGMTDLTA
1301 IGDLTTARLG EVLGSKSDAA WNLHELTRDL DLSAFVMFSS GAGVWGSGQQ
1351 GAYGAANHFL DALAEHRRAQ GLPATSIAWG PWAEAGMSAD PESLTYFKRF
1401 GLLPIAPDLC VKAXjHQAVDA GDATLTVANF DWAKFTPTFT AQRPSPFLDD
1451 LPENQREAEQ TGTAAETSAF REELAKTPAS QRLGFLVQQV RTYAAATLGR
1501 TVEDIPAAKP FQEJjGFDSLT AVQLRNQLNT TTGLSLPATV IFDHPTPEAL
1551 ATHLRGQLGD GAEVAGEGDV LAALDKWDTA FGAAEVDEAA RRIVGRLQV
1601 LVSKWSPAQD GPEGTDSAHA DLEAASADDI FDLISSEFGK S*

MonD, cytochrome P450 hydroxylase Length: 431 amino acids 

1 VGLTVGPDNA KRGIVPITDS KPAATFPDLV DPSFWARPHA ERVALFEEMR 

51 GLPRPAFIRQ NMPGVPWTFG YHALVKYADI VEVSRRPQDF SSNGATTIIG 

101 LPPELDEYYG SMINMDNPEH SRLRRIVSRS FGRNMIPEFE AVATRTARRI 

151 IDELIARGPG DFIRPVAAEM PIAVLSDMMG IPAEDHDFLF DRSNTIVGPL 

201 DPDYVPDRAD SERAVIEASR ELGDYIAGIiR AERLAAPGND LITKLVQVQA
251 DGEQLTRQEL VSFFILLVIA GMETTRNAIS HALVLLTEHP EQKQLLLSDF
301 DTHAPNAVEE ILRVSTPINW MRRVATRDCD MNGHRFRRGD RIFLFYWSGN
351 RDESVFPDPY RFDITRGTNA HVTFGAVGPH VCLiGAHLARM EITVLYRELL
401 AALPQIHAVG QPRRLDSSFI EGIKHLHCAF * 

MonRI, probable activator protein Length: 268 amino acids 

1 VRYEMLiGPLR IKDGNDYATI NAQKVEIVLT VLLIRADRW SLEQLMREIW 

51 GEDLPRRATA GLHVYISQLR KFLKVPGSAG NPVETRAPGY VLHKRDDDQI 

101 DAQIFPELVD VGRSLLREKR FDEAASCFGQ ALALWRGPIL GQGGNGPGTN 

151 GPIIDGFSTW LTEIRLECQE MLVECQLQLG RHREAVGMLY ALTAENPMCE 

201 AFYRQLMLAL YRSERQADAIi KVYQSVRKTL NDELGLEPGR PLQELQRAIL

251 AGDMHLMSPP PLALSGR*

MonAX, thioesterase Length: 278 amino acids

1 LSAFLAKGKI LSAFPPPDMS DPWIRRFRPR PEAWRLVCF PHAGGSASYY 
51 HPLAQSPTLP TDSEVLAVQY PGRQDRRRER LLDDIGELAD LITDALGPFD 
101 DRPLAFFGHS MGAVLAYEVA QRLRERTGKQ PCRLFVSGRR APSRFRRGTV 
151 HLLDDTELAA ELRRAGGTDP RFLDDEELLA EIIPWRNDY RAVSLYRWNP
201 SPPLSCPITA LVGDRDPQAP LDEVEAWQQH TEGPFDLKVF AGGHFYLNTH
251 QQGVTEVISK ALADSAQQRA TARGNAR*

ORF29, a homologue of CapK involved in cell wall biosynthesis Length: 428 
amino acids 



1 LADLVAHARS ASPYYRELYH GLPERIEDPT LLPVTDKKQL MDHFDDWPTD
51 RDITFEKVRA FTDDPELIGR RFLGRYLVAT TSGTSGRRGL FVLDDRYMNV
101 SSAVSSRVLA SWLGPLGIAR AWHGGRFAQ LVATEGHYVG FAGYSRLRQD
151 GEARSKLVRA FSVHEPMSRL VAELNEYRPA FVIGYASTIM LFTAEQEAGR
201 LiHIDPVLVEP AGETMTESDT DRIAAAFGAK VRTMYSATEC TYLSHGCAEG
251 WYHVNDDWAV IjEPVDADHRP TPPGEFSHTT LiISNLANRVQ PFLRYDLGDS
301 VMLRPDPCPC GTPSPAIRVQ GRSGDILTFP SGRGDDVSLA PLAFSSLFDR
351 MPGVELFQIE QTAPSTLRVR WQAPGADAD HVWQRAHDGL THLLADNKLD
401 NVTVERGEEP PRQASGGKYR TIIP1AA*








LipB, lipase B Length: 338 amino acids 








1 VKVPVEVTVR LSSWLGGLVA AVLAATVLPA SAASAADVSS PPLEIPAAEL
51 AKALHCGTEL GDLiRDAGDKP TVLFVPGTGL KGEENYAWNY MAELKKKGYQ
101 SCWVDSPGRG LRDMQESVEY WYATRAIQE ATGRKVDXjVG HSQGGLLTAW
151 AJLiRFWPDLPG KVDDMVTLiGS PFQGTRLASP CRPIAEVAGC PASVLQFARD
201 S1STWSKALGAD GTPMPAGPSY TTIYSYADES WADGEAPSL PGAHRIGVQD
251 ICPGRPWPTH IAMWDQVSY DLVADAIEHP GPADTSRIDR AHCAKPVMPL
301 NSQEAVDALP GLLNTFPIELL IHSQPWVDEE PPLRPYAR

ORF31, putative ion pump Length: 309 amino acids 



51 LADAAHSLTD AVGVSLiALGA ITLAQRAPTP RRTFGFCRVE IFSAVLNALL

101 LWIFAWVLW SAIGRFSEPV EVKGGLMFW ALGGLAANLV GLWKLRDAKE

151 KSLNLRGAYL EVLGDALGSV AVIVGGLVIL LTGWQAADPI ASIVIGLLIV

201 PRAYGLLRDS LHVLLEATPQ DVDLGEVRRH LLEERGWAV HDLHGWTVTS

251 GMPVLTAHW VTEEALASGY GELLGRLQRC VGGHFDVAHS TIQLEPEGHV

301 EEDGALHT*

ORF32, hypothetical membrane protein Length: 364 amino acids 

1 MTRALTLHDW IVAGIAWAG WAGLI1LRAL LRWLGERASK TRWSGDDVIV
51 DALRTLVPCA AITAGLAAAA GALPLTPRTG RNVTMTLTAL LILAATLTAA
101 RTVTGLVKAV AQSRSGVAGS ATIFVNITRV WLAMGFLIV LQTLGISIAP
151 LLTALGVGGL AVALALQDTL ANLFAGVHIL AAKTVQPGDY IQLSSGEEGY
201 WDINWRNTT VRQLSNNLVI IPNAKLAGTN MTMYSRPEQE LSIMVQVGVS
251 YDSDLEQVEK VTTEWBEVM AE1TGAVPDH EAAIRFHTFG DSRISFTVIL
301 GVGEFSDQYR IKHEFIKRLH QRYRAEGIRV PAPVRTVRVQ QGELPPPLGI
351 PHQRDTSTQA RLH*

AmtA, glycine amidinotransferase (partial coding sequence) 
Length: 131 amino acids 

1 MSPVNSHWEW DPLEEIIVGR LEGATIPSSH PWACNIPTW AARLQGLAAG 
51 FEYPQRLIEP AQQELDQFIA LLQSLDVTVR RPAAVDHKHR FGTPDWQSRG
101 FCNSCPRDSM LWGDEIIET PMAWPCRCFE T 



1 MGHDHGPSAG AAGGTLSGTY RKRLLWTIGI SGSITVIQW GALLSGSLAL

CLAIMS:

1. A DNA sequence which is (a) at least part of
the sequence set out in the appended sequence listing; or
(b) a variant of a sequence (a) which encodes a
polypeptide which is at least 80%, preferably at least
90%, identical with the corresponding peptide as set out
in table II; provided that it is not a sequence encoding
all or part of the polypeptide consisting of amino acids
1-920 encoded by mon AI as set out in table II.
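
Claim 1(b) depends on a percent-identity threshold between an encoded polypeptide and the corresponding peptide of table II. The claims do not say how identity is to be computed, so the Python sketch below makes its own assumptions: the two peptides are already aligned to equal length, `-` marks a gap, and identity is counted over ungapped columns only. The names `percent_identity` and `meets_threshold` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap.

    Assumption: inputs are pre-aligned, equal length, '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are excluded under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True when identity reaches the threshold (80 or 90 in claim 1)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other conventions, for example dividing by the full alignment length, give different percentages for the same pair, so the choice would need to be fixed before applying the 80% or 90% figure.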



2. A DNA sequence according to claim 1 comprising
the complete monensin gene cluster or a variant thereof.

3. A DNA sequence encoding at least part of at least
one polypeptide which is necessary for the biosynthesis
of monensin, and which is encoded by DNA included in the
appended sequence listing or an allele, mutation or other
variant thereof; provided that said polypeptide is not
all or part of amino acids 1-920 encoded by mon AI as set
out in table II.

4. A DNA sequence according to claim 3 which
comprises at least part of one or more of the following
genes: mon BI, mon BII, mon CI, mon CII, mon H, mon RI,
mon RII, mon T, mon AIX and mon AX.

5. A DNA sequence according to claim 4 comprising
all of the genes listed therein or an allele, mutation or
other variant thereof.

6. A DNA sequence according to claim 3 encoding at
least part of one or more of the polypeptides set out
below, said polypeptide having the amino acid sequence as
set out in the appended sequence data or being a variant
thereof having the specified activity:

peptide      activity

mon CII      epoxyhydrolase/cyclase
mon E        S-adenosylmethionine-dependent methyl transferase
mon T        monensin resistance gene
mon RII      repressor protein
mon AIX      thioesterase
mon AI       polyketide synthase multienzyme
mon AII      polyketide synthase multienzyme
mon AIII     polyketide synthase multienzyme
mon AIV      polyketide synthase multienzyme
mon AV       polyketide synthase multienzyme
mon AVI      polyketide synthase multienzyme
mon AVII     polyketide synthase multienzyme
mon AVIII    polyketide synthase multienzyme
mon H        regulatory protein
mon CI       flavin-dependent epoxidase
mon BII      carbon-carbon double bond isomerase
mon BI       carbon-carbon double bond isomerase
mon D        cytochrome P450 hydroxylase
mon RI       activator protein
mon AX       thioesterase

7. A DNA sequence according to claim 6 encoding a
single enzyme activity of a multienzyme encoded by any of
mon AI-mon AVIII or a variant or part thereof.

8. A DNA sequence according to any preceding claim
encoding any one or more of the domains as set out in
Table I or a variant or part thereof.

9. A DNA sequence according to any preceding claim
which has a length of at least 30, preferably at least 60,
bases.

10. A recombinant cloning or expression vector
comprising a DNA sequence according to any preceding
claim.

11. A transformant host cell which has been
transformed to contain a DNA sequence according to any of
claims 1-9 and which is capable of expressing a
corresponding polypeptide.

12. A hybridisation probe which is a DNA sequence
according to any of claims 1-9.
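
Claims 9 and 12 together describe a probe as a DNA sequence of at least 30 (preferably at least 60) bases. As an illustration only, the Python sketch below checks that length condition on a block-formatted sequence and computes the reverse complement, that is, the strand a single-stranded probe would be expected to pair with. The helper names `reverse_complement` and `is_valid_probe` are hypothetical, and the sketch assumes unambiguous A/C/G/T bases.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (spaces removed)."""
    bases = dna.replace(" ", "")
    return bases.translate(_COMPLEMENT)[::-1]


def is_valid_probe(dna: str, min_length: int = 30) -> bool:
    """True when the sequence has only A/C/G/T and at least min_length bases."""
    bases = dna.replace(" ", "").upper()
    return len(bases) >= min_length and set(bases) <= set("ACGT")


# Thirty bases written in the ten-base block layout used by the listing.
example = "GATCAGCGCG GTGGCGTCGT CGGCGTCCAG"
example_ok = is_valid_probe(example)
example_rc = reverse_complement("GATCAGCGCG")
```

Passing `min_length=60` applies the preferred lower bound of claim 9 instead.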

13. Use of a probe according to claim 12 to detect a
PKS cluster, optionally followed by isolation of the
detected cluster.

14. Use of a probe according to claim 12 which
encodes at least part of a polypeptide having a known
function to detect genes encoding polypeptides having
analogous function.

15. Use according to claim 14 wherein the
polypeptide of known function is AT of module 5 or the
regulatory protein encoded by mon RI.

16. A hybridization probe comprising a
polynucleotide which binds specifically to a region of the
monensin gene cluster selected from mon BI, mon BII, mon
CI, mon CII, mon H, mon RI, mon RII, mon T, mon AIX and
mon AX.

17. Use of a probe according to claim 16 in a method
of detecting the presence of a gene cluster which governs
the synthesis of a polyether, and optionally isolating a
gene cluster detected thereby.

18. Use of a probe according to claim 12 which
comprises a polynucleotide which binds specifically to a
gene responsible for levels of activity of the monensin
gene cluster, in a method of detecting an analogous gene
in a gene cluster for biosynthesis of another polyketide,
optionally followed by a step of manipulating the gene
detected thereby to alter the level of expression of said
other polyketide.

19. Use according to claim 18 wherein the gene is a
regulatory gene, resistance gene or thioesterase gene.

20. Use of the mon RI gene or variant and a monensin
promoter to control expression of a heterologous gene in
S. cinnamonensis.

21. Use of a portion of the monensin gene cluster
encoding a polypeptide having chain terminating activity,
preferably comprising at least one of mon AIX and mon AX
or a mutant, allele or other variant thereof encoding a
polypeptide having chain terminating activity, to effect
chain release of a peptide other than monensin.

22. Use of a portion of the monensin gene cluster
encoding a polypeptide having carbon-carbon double bond
isomerase activity, preferably comprising at least one of
mon BI and mon BII or a mutant, allele or other variant
thereof having isomerase activity to provide a desired
stereochemical outcome in the synthesis of a polyketide
other than monensin.

23. A polypeptide encoded by a portion of the
monensin gene cluster, preferably comprising at least one
of mon BI and mon BII or a mutant, allele or other variant
thereof, having carbon-carbon double bond isomerase
activity, or at least one of mon AIX and mon AX or a
mutant, allele or other variant thereof having chain
terminating activity.

24. An epoxidase enzyme encoded by mon CI or a
derivative or variant thereof having epoxidase activity.

25. A cyclase enzyme encoded by mon CII or a
derivative or variant thereof having cyclase activity.

26. Use of a portion of the monensin gene cluster
encoding a peptide having epoxidase or cyclase activity,
preferably comprising mon CI or mon CII or a mutant,
allele or other variant thereof encoding a polypeptide
having epoxidase or cyclase activity to provide a said
activity in the biosynthesis of a polypeptide other than
monensin.

27. A process for producing a polyketide containing
a desired starter unit comprising providing a PKS gene
having a loading module and a plurality of extension
modules, wherein the loading module includes a KSq domain
derived from a KS domain of a monensin extension module.

28. A process according to claim 27 wherein the KSq
domain is derived from KS of module 5 of monensin.

29. A process according to claim 27 or claim 28
wherein the starter unit also includes an ATq domain
derived from an AT domain which is naturally associated
with the KS domain.

30. A DNA sequence comprising DNA encoding at least
one PKS loading module and a plurality of PKS extension
modules, and which can be expressed to produce a
polyketide; wherein at least one of said modules or at
least one domain thereof is a monensin module or domain or
a variant thereof and is contiguous to a further one of
said modules or a domain to which it is not naturally
contiguous; provided that the sequence is not an ery
loading module, the first and second extension modules of
the ery PKS and the ery chain-terminating thioesterase in
which the DNA encoding AT of the first extension module
has been substituted by DNA encoding an ethyl malonyl-CoA
AT from the monensin gene cluster.

31. A DNA sequence according to claim 30 wherein
said further module or domain is also a monensin module or
domain or variant thereof.

32. A DNA sequence according to claim 30 wherein
said further module or domain is a module or domain of a
PKS of a polyketide other than monensin or a variant
thereof.



33. A DNA sequence according to claim 30, 31 or 32
wherein said loading module is adapted to load a starter
unit other than a starter unit normally received by the
adjacent extension module.

34. A DNA sequence according to claim 33 wherein
said loading module is derived from a monensin extension
module or variant thereof.

35. A polyketide synthase encoded by the DNA
sequence of any of claims 30-34.

36. A polyketide compound as produced by a synthase
according to claim 35.

37. A vector containing a DNA sequence of any of
claims 30-34.

38. A transformant cell transformed to contain a DNA
sequence of any of claims 30-34.

39. A method of producing S. cinnamonensis capable
of enhanced levels of production of monensin comprising
engineering it to overexpress the mon RI gene.

40. A method according to claim 39 wherein said
engineering comprises introducing at least one additional
copy of the mon RI gene as shown in the appended sequence
data or a variant thereof.

41. S. cinnamonensis containing multiple copies of
the mon RI gene as shown in the appended sequence data
and/or variant(s) thereof.

42. A method of producing monensin comprising
culturing the organism of claim 41 and/or an organism
produced by the method of claim 39 or claim 40.

43. A process for expressing a gene heterologous to
S. cinnamonensis comprising transforming S. cinnamonensis
with DNA encoding a heterologous gene and expressing said
gene under control of the activator gene mon RI or
actII/orf4.

44. A process according to claim 43 wherein said
heterologous gene is a PKS gene.

45. 13-Propyl erythromycin A.

POLYKETIDES AND THEIR SYNTHESIS
ABSTRACT

The complete sequence of the gene cluster for the
monensin type I polyketide synthase, from S.
cinnamonensis, is provided. Thus variant polyketides
containing monensin-derived elements can be genetically
engineered. Furthermore there are novel features, e.g. a
regulatory protein mon RI, which are of wide utility.

SEQUENCE LISTING

1 GATCAGCGCG GTGGCGTCGT CGGCGTCCAG CTCGTTCTGC GTGGCGGACG 

51 GCAGCGCGAT GTCGGCAGGC , ACCTCCCAGA CCCGGCGGCC CGGCACGAAG 

101 CGGGCCGAGG CGCCGCGGCG CTGGGCGTAG GTGTCCACGC GGGCGCGTTC

151 GACCTCCTTG ACCTGCTTGA GGAGGTCCAG GTCGATGCCC TTCTCGTCGA 

201 CGACGTAACC GGAGGAGTCC GAACACGTCA CGGCGTTGGC GCCCAGGGCG
251 GCGAGCTTCT GGATGGTGTA GATGGCGACG TTCCCGGAGC CGGACACGAC 
301 CGCCGTCCGG CCTTCGAGGG TCTCGCCGCG CTCACGCAGC ATCGCCGCCG 

351 CGAAGAGGAC GTTGCCGTAG CCGGTCGCCT CCGGACGGAT CAGGGAGCCG

401 CCCCAGTTGC GGCCCTTGCC GGTGAGGACG CCCGCCTCCC AGCGGTTGGT

451 GATGCGCCGG TACTGACCGA ACAGATAGCC GATCTCCCGG CCGCCGACGC

501 CGATGTCGCC CGCGGGCACG TCCGTGTGTT CGCCGATGTG CCGGTACAGC
551 TCCGTCATGA ACGACTGGCA GAAACGCATG ACTTCCGCGT CGCTGCGGCC 
601 GCGCGGGTCG AAGTCGCTGC CGCCCTTGCC GCCGCCGATG CCGAGGCCCG 
651 TCAGCGCGTT CTTGAAGATC TGCTCGAAGC CCAGGAACTT GATGACGCCG 
701 AGGTTCACCG ACGGGTGGAA GCGCAGGCCG CCCTTGTACG GGCCGAGGGC
751 GCTGTTGAAC TCCACCCGGA AGCCGCGGTT GACCCGCACG CGACCGTGGT 
801 CGTCCTGCCA CGGCACCCGG AAGACGATCT GGCGCTCCGG TTCGCACAGG 
851 CGCTCGATCA GGCCGGCTTC GGCGTACTCG GGGCGAGCCG CGATGACCGG 
901 CGCCAGGGTC TCGAGGACCT CGCGGGCGGC CTGGTGGAAC TCCGGCTGGG 
951 CCGGGTTGCG GTGTTCGATC TCGGTGAGCA GCTGGGAGAG TGCTGTCTTC 

1001 TGCGAGAGAG CTGTCTTCGT GTCGGGTCGC GTGGTCAAAG GAGCCCTTTC 

1051 TGGCACGGCC GGCGTAGGCG CTCGGCGCCG TTGCCGTGCG CAGGGAGACG 

1101 CTCGAGCCGC AAGTATGACG CGCATGTAAA CACAGCGACC AGCCCCCCGG 

1151 TCCAGGGAGT GACCACCATG CGAGACCGGG CCACCGGTAG GGCCACCGGT

1201 CCGGCCTGCG GACCCCGTGT CACTTCCGGC TCGCGGCCAG GGGTGCCGCC
1251 CGGCGGACCG AATCGGCGGA GGCGGCCAGC AGTGGCATGC GGACGGCCGG

13 01 GCTGGGAATG CGGTTCTGGG CGTGCAGCAC TCCCTTGATC ACCGTCGGGT 

13 51 TCGGTTCGGT GAAGAGGGCG GCGGAAAGGC GGGCGAGGTC GGCTCCGAGA 

14 01 GCGCGGGCGG GTGCGGCGGA GCCGCGTCGC CACAGCGCGA TCATCTCGGC 
14 51 GTAGTGGGCG GTACGCAGAT TGGCCGACGC CACGATTCCG CCGTGGGCGC 
1501 CCGCAGCGAC CAGCGG CG AG AGGACGATGT CGTCACCGCC GAGCACGGCG 
1551 AAGCCGGGCA GGGGCGAGTC GAG C AACTCC ATGGTGG TCG GGTCGATCGA 
1601 GCCGGTCGCG TGCTTGATGC CGACGACCTC CGGCAGGCGG CCGAGTGCGG 
1651 TG ATCG TGCC CGCGCCGAGC GTCTGCCCGG TGCGGTAAGG GATGTCGTAC 

17 01 ACGACCAGGG GGAGGCCGCC GTGCTCGGCC AGCGCGGCGA AATGAG CCAG 
1751 GGTCCCCGCT TCCCCGGGGC GGATGTAGGG CGGCGCGGGG ACCAGCGCGG 

18 01 CGGCGACGTC ACCCCGGGCC GCCAGCTCTC GCAGGGCCGT GATGGCGGTG 
1851 GCGGTGTCGT TGGTGCCCAC CCCGACGATG AGCGGTGCCC CGTGTGC CCG 
1901 GCACGCGGCC GAGCAGACGC GGATCACCGT CTCTCTCTCC TCGGCGGTCA 
1951 GTGTGGCGGC CTCGGCGGTC GTACCGAGGG CGACGAGCCC GGAGGCGCCG 
20 01 GCCGACAGCG CCTCGTCGGC GAGTCGGGCC AGCGCCTCGG GGGCCAGGCG 
2 051 CAGATCGTCG GTGAACGGAG TTACCAGGGG GACGTACAGG CCGTTGAAGA 
2101 GCGGTTCGGT GGTCGGTTCG AGGCTCGATG CGAGGGTCAT GCTCTTACCC 
2151 TGGCCCACGC CACTCGGTAG ATCCATTTCA GATTCCTGCC GTCACACCTA 

22 01 AG CTG AACTT ATGCTCGATG TCCGTCGCCT CCATCTGCTC CGCGAACTCG 
2 251 AC CGGCGGGG CACCATCGCC GCCGTGGCCG AAGCGCTGAC CTTCACCGCG 
2 3 01 TCCGCCGTCT CCCAGCAGCT CGGCGTGCTG GAGAGGGAGG CGGG CGTGCC 

23 51 GCTGTTGGAA CGCAGCGGCA GGCGCGTGGT CCTCACGCCC GCAGGACGCT 

24 01 CCCTCGTCGC ACACGCCGAC GCGGTGCTGA ACCGTCTCGA ACAGGCGGTC 
2451 GCCGAGCTGG OGGGCG CACG GG ACGG CATC GGCGGGCCGC TGCGCATCGG 

25 01 GACGTTCCCT TCCGGCGGCC ACACCATCGT CCCCGGCGCG CTGGCCGAAC 
2551 TGGCCTCTCG TCACCCCGCG TTGGAGCCGA TGGTGCGGGA GATCGACTCC

2601 GCGCGCGTCT CCGACGGTCT GCGGGCCGGT GAGCTGGACG TGGCCCTCGT 

2651 ACACGACTAC GACTTCGTAC CCGCGACGCC GGACACGACC GTGGACGAGG 

2 701 TGCCTCTGCT CGAAGAGCCG ATGTACCTCG TCACCCATGC CGCGGACACT 

2 751 GCCACGGACT CCGGCTCCGG GAG CACACTG GCAGCGCTGC TCGGGCCCTG 

2801 TGCCGAGGTT CCGTGGATCA CGGCGCGGGA CGGCACGACC GGTCACGCGA 

28 51 TGGCTGTACG CGCCTGTCAG GCCG CCGGGT TCCAGCCCAG GATCCGCCAC 

2 901 CAGGTCAACG ACTTCCGCAC GGTGCTGGCT CTGGTCGCCG CCGGGCAGGG 

2 951 GGCCGGGTTC GTGCCGCGGA TGGCCGCCGA GCCGAGCCCC GCGGGCGTGG 

3 001 TGCTCACGAA GCTGCCGCTG TTCCGTCGCT CGAAGGTCGC GTTCCGTGCG 
3051 GGCGGCGGTG CCCATCCGGC GATCGCCGCT TTCGTGGCCG CGGCGACGAC 
3101 GGCGGTCGAA CGCATGGCGG GTTCACGAGG CCCGGCCGGC GGCTCTGAGT 
3151 GAACCGGCCG ACCGTGGGAA TGTGTGTGCC CTGGGCCGCA CCATTCGTGG 
3201 CCTGGTGACG TCCTGGCGAC GTCCTGACGT CCTGATGTCC GAACGAGAAG 
32 51 GCGATTTTCC GCGATGGCCG ATGACGCGTA CCTGTTCCTC CTCCCCGACC 
3 3 01 GGCACCCCCG ACTGGGAGCG GCCCTCGCCG CCGTCGGTGC CTTGGAATGC 
3 351 ACGGAAACCC CTGCGGTGCA CGCCTGGTTG CAGGCTCATG AGGCCTCCGT 
34 01 GTCCTCGGAA CAGGTCAGGA TTCTGCCCGC CGATGCCGAG ACACTCATCC 
3 4 51 CGAAGGACGC CGAGCGGCTG CCGGTGCCGT TGAGCGAGGA GGAGGCGCTC 
3 501 AAGGTCGAGC AGGAGTGCGC GCCCCAGACC GTCACGGACA TGGAGAGCGA 
3 551 ACTG CTCGCG TTCCGGGAGA CGACCCAGGA CTGGCAGGCC CTCGTGCACC 
3 601 GGGCCCTGAC CGCGGG CATC CCCGCGCAGC GCATCGCCCG GCTGACCGGA 
3 651 CTCGACCCGG AGGAGATCGG CCGCCTGTAG GCGCTAGCGG CCGCCCAGTG 
37 01 CGGACACCAG GATGGCGACC GTGACGGTGT TGAAGACGAA GGCGATGACC 
37 51 GTGTTCGCCG CCACGGTCCG TCG CATGTCG CGTGAGGTGA CGTCGACATC 
3 8 01 GGTGGTGCCG AACGTCGTCA TCGCGGCCAG GGCGAAATAG ACGTAGTCGG 
3851 CCCAGGCGGG ACTCCGCTCC CCGGGGAATT CCAGTGCCCG CTCGTTCTCC

3 901 ACGAGGTTGT CGGCCTGGAA GGTGACGGCG AAGGCCACGA CCACGCAGAT 

3 951 CCAGGCGGCG A CG AC C AG GG CGAGCGCCAC CAGGGTGCGG GGGAGCGCGG 

4 001 AGAAGGTGGT GCTGAGGTGG CCGGGAAGCC ACAGCACCGC CACCACCAGC 
4 051 GCGGCAGCCG CGATGAAGAG CGAACCCCCG GGGCCGGGCG CGGTTC CG AG 
4101 GACGTAACGC TGCAGGAATG TGCCGCGGGC TTCGCGCCGC GCCCAGGAGC 
4151 GGACCTGCTC CGGAGCGACG CTCACGAAGA CGGTCATGGT GATGGCGAGG 
4 201 TAGGGC AG C A GGTAGGCGAA G AAG ACG AG C ACGCCGACAT CCGCTGCCGA 

42 51 AATCCGCACC ACGGCGTCGA TGGGGAGGAC CACTGCCGCG CACGCCGCGA 

43 01 CGGCGAGGCT CACCGCCGAC CGGCGCCGTT CGGAAAGCCA GCGATGCACG 
4 351 GACGAGCCTC TCTGGTCGGG CGTCGGGCCT CGTGTGATCG TGACCGGCTC 

44 01 CGCGCCCGCC GAAAGCGCGG TGCGATCTCC TGCCCTCGAA CGAGCGAAAC 
4451 GCTTGCGCCG GAAAGCCTCC CTGCTGATGC CGACGGCGGC GG CAGTGGCT 
4501 GCGGATGCGG ATCGTGCGCT GTGCCCTGAC CCTGGATGGG GGGAGGAACG
4 551 CAGAGAGGCA GGTGCGCCCA TGACGGTCAT GGACAAGCTC AAGCAGATGC 
4 601 TCAAGGGGCA CGAGGACAAG GCCGG CCAGG GAATCGACAA GGCGGGCGAC 
4651 TTCGTCGACG GGAAGACGCA GGGCAAG TAC AGCGGTCAAG TCGACACGGC 
4 701 C C AGG AC AAG CTCCGGGACC AGTTCGGCTC GGATCAGCAG GAGCCTCCGC 
4 751 AGAGGTAGGC AGCGTCAGGG CGGAATCGGT CCGGGCGACC GCTGACCGCT 
4 801 GATGCAGATG CCGCAGACGT CGGCCCCGCA CTCCTCCGGG TAAATCGGAG 
4 851 CGTAGGCGGG GCCGACGTGT GCGCGTGCGG CCTCGTCTCT GCCGCCCCTC 
4 901 TCCGCCCCGT CTCTGGCCCC TTGGTGCCAG TCTGACGGGA AAATGG CACC 

4 951 ACTTGGTGCC ACGCATGTGC CATGATGGCG TGATCGAGAG CGCGCTGCCC 

5 001 CGACTCGCGG GCAGGAAGGG CGCGTTCCGC GGAGTCGGCC GTCGGAGGGG 
5 051 TTGCATCATG GGGACAGCAC AGAGCCAGGA GCAGGCCGCC GCGCCCGGTG 
5101 CCTGCGCCGC CTTCGTCCGC TTCGTGCTCT GCGGTGGCGG AGTGGGC CTC 
5151 GCCTCCAGCT TCGCCGTGGT CGCCCTCGCC TCCTGGGTTC CCTGGGCGCT

52 01 GGCCAACGCC CTGGTCGCCG TGGTCTCCAC CGTCGTCGCC ACCGAGCTCC 

52 51 ACGCCCGCTT CACCTTCGGT GCGGGCGGGC GCGCGACCTG GCGGCAGCAC 
5 3 01 GCGCAGTCGG CCGGGTCCGC GGCGGCCGCG TACGCGGTGA CCTGCGTGGC 

53 51 GATGTTCGTC CTGCAGCAGC TGGTGGCGGC GCCCGGCGCG GTGCTCGAGC 

54 01 AGGTCGTGTA CCTGTCGGCC TCCGCGCTCG CCGGTGTCGC GCGGTTC(?TG 
54 51 GTGCTGCGCC TCGTCGTCTT CGCCCGGAAC CGCTCGCTGC CCGCCGCGGC 
5501 CGCCGTGCGC ACCGCGCGTC CCGTGCGTCG CGTGCCGGCG CCCGTGCCCG 
5551 CGACCGTGGC CCACGCCGCA TCGCGCCCGG CCGGCCCCGC GGCGCTCTGC 
5601 CCCGCCGCAT GACTCCGTGC CCGCATGTTT GTGCCCCCGG TGCTCCGTGC 
5651 GTCCGGGGGC GGGGTGGGCG TCGTGCCCGG GTGGTCCAGG GGTCACGCGG 
5 701 TGGTGTGTGC CAGTTCCTGG CCGAGGTGGT GGGCGAGCTG TGCGGGCGTG 
5751 GGGTTCTCGA CGATGGCGAC CATCGCGATC TCCATCCCGG T CAGCGTC AT 
5 801 CAGTGTCTTG GTGAGCTCAA GGGCGGTGAG GGAGTTGAGA CCGTTCTCGA 
5 851 GGAAGTTGCT GTCGTCGCTG AGGGTGGTGT TCAGAAGGGT GCCGGCCTGG 
5901 GTGCGGATGG TGTCGGTGAG GAGCTTCTCG CGCTCCTCGG GGGTGGCCGC 
5951 GGCGAGCTGC TTCTCC AG CT CGGTGGCGTC CTGGCCGGAG GTGTGGTCGG 
60 01 TGCTGGTCAT GACTGCTCCT GTGTGAGTGA GGTGTTGGCG GGGGTCACAC 
6051 CGCGGCGTGC GCGGTGTGGT CGTGCAGCCA GTAACGCGTG GCCTGGAAGG 
6101 AGTACGTCGG GAGGTCGATG GTCCGGGGGT GGGGGGTGCG CCGGACGAGA 
6151 GGGGTCCAGT CGACGGTGCC GCCCGTGGTG TGAAGCCGCG CGAGGG CGGT 

62 01 CAACAGGGCG CTTACGGCGG AGGTTTGCGT GCCTTCCGGT GAGAGCGCGC 
6251 CCAGGTGGAG AAGCGTGTGG GTCTCGGGGG TGGGGGG TG C GGTGGGGGCC 

63 01 GGCGAGGTGA GGTGGTGGTG CCAGTAGTCG GCGGAGGCGA TGGGGGTGTC 
63 51 GGCCGGGGCA GTGCTGGTGA GCGTGAGCGT GGCGCGTTGG AACGTC AG CT 
6401 GCTTCAGCAC GGGC TCGTAG GCGTCGGGCG GAGCCGGTTG TTCACCCTCG 
6451 GCGGCCTGGG CGGCAGCGGC GTGGGCGGCG GCCAGGCGGC ACGCGTCGTC

6501 GAGGGTCAGG ATTCCCGCGG CGTACGCGGC GGCGATGTGG CCGACACCGT

6551 CGCCGGTGAG GGTGTGGGGG CGTACCCCCG TTTCCAGGAG CAGCCGCGCG 

6601 AGCGCGGTGT GGACCGCGAA GCGCGCCAGT TCGGAGTGGG GAGTGGGGAG 

6651 GGGAGTCGGC AGATGGGTGT CGAGGAGCGC GCGCGCTTCG TCGAAGGCGG 

67 01 ACGCGAAGAG CGGGAACGCC GAGTGGAACT CGGCACCTCC GAAAGCCG*CG 
6751 CCGAATGTAG CGCCGAATGT CGCGCCGGGT TTGGCTCCGG GTGCCGCCCC 

68 01 CGTCGTCACC CCGTCGGCCG GGCGGCCGTC GAAGTGCCAG GCGATCTTCT 
6851 TCGGGCCGGC CCCGGGCGTG GACCTGACCA GGTCCGGGTG GTCCTCTCCG 
6 901 GCGGCCAGGG CGCGGGCGGC GGCGAGGAGT TCGGTGTGGT CGGTGCCGGT 
6951 GAGGACGGCG CGGTGTTCCA GGGGGCTGCG GGTGGCGGCG AGCGAGTAGG 
70 01 CGACCTCGGC GGGGGAGGGC GCGGGGTCGG TGGCCGCCAG GTGGGTGACG 
7051 AGGGCCTTCG CCTGTGCCCG CAGGGCCTCG GGTGTACGAG CGGACAGGCT 
7101 CCAGGCCACC GGGAGTTCCG GGGCAACGGG CGACGTCTGG TCGCGGGCGG 
7151 CATCCGGCAC CGGAGCCTCG TCCACCGGCG GCTCTTCGAG GATGAGGTGC 
72 01 GCGTTCGTGC CGGACGTGGC GAAGGCGGAG ATGCCGACCC GGCGGGGCTC 

72 51 CTCGCGGCGG GGCCAGTCGA CCGCCTCGGT GAGCAGCCGT ACCGCGCCCT 

73 01 TCTTCCAGGC GGCGAGGGGC GTCGGGCGGT CGACGTGGAG GGTCGGCGGC 

73 51 AGGGTGCCGT GCCGGAACGC CTGGACCATC TTGATGAGCG CGGCCGCACC 

74 01 CGCGGCCCCC TGCGTGTGCC CCGTGTTGGA CTTGACGGAG CCGAGCCACA 
74 51 GGGGCCGGTC GGGGGAGCGG TCGGCGCCGT AGGTGG CG AG GAGGGCCTGG 
7501 ACCTCGATGG CGTCGCCGAT GGGGGTGCCC GTC CCGTGCG CCTCGACGGC 
7551 GTCGATCTGG TCCGGGGTGA GCCCGGCGTC GGCGAGGGCG GCGCGGATCA 

76 01 CATGCTGCTG GGAGGGGCCG TTGGGGGCGG CGAGGCCGTA TCCGGCGCCG 
7651 TCCTGGTTGA CCGCGGAGCC CCGGATGACG GCGAGCACCG GGTGGCCGTT 

77 01 CTTCCTGGCG TOGCCGAGCC GCTCAAGCAG GACGAGGCCG ACGCCTTCAC 
7751 CGAGGCCCAT GCCGTCGGCC GCGGCGGCGA ACGGTTTGCA ACGGCCGTCC

7 801 TGCGCGAGCG ACTTCTGGTG GGCGAAGGCG TGGAAGGTGT GCGGCGTCGA 

7 8 51 CATGACGGTG CCGCCGCCGG CGAGGGCGAG GCCGCACTCC CCGCGGCGCA 

7 901 GCGCCTGGCA GGCCAGGTGG AGGGCGACCA GGGAGGACGA GC AGG CCGTG 

7 9 51 TCCACGCTGA TGGCGGGGCC CTCGAGGCCG AGGGCGTAGG CGATGCGGCC 
8001 GGAGACGAGG CTGCCGGACG TGCCGCCGCC CAGATAGGGC AGCAGCTCGT 

8 051 CGGGCGCGGT CTCGAGCCGT GTCGCGTAGT CGTGCCCGGT GGCGCCGACG 
8101 TAGACGCCGG TGAGGGTGGA GCGCAGGGTG TGCGGGGCGA TGTGGCCGCG 
8151 TTCGACGGTC TCCCACGCGA GGTGGAGCAT GAGGCGCTGG AGGGGTTCGG 
8 201 TGGCCACGGC CTCGGTGTCG CTGATGTCGA AGAAGCCCGC GTCGAAGCCG 

82 51 GCCGCGTCGT CCAGGAACCC GCCGAGCTCC GCGTACGGGC GTTCCTCGGG 

83 01 GAGTTCCCAG GCG CGGTCGT CGGGGAAGCC GGTGACGGCG TCGCGGCCCT 

83 51 CGGACACCAG ATCCCACAGG TCGTCCGGGG TGCGGGTCTT GCCGGGCAGC 

84 01 CGGCAGGCCA TGGAGACGAC GGCGATCGGC TCGTG CTGTG CGGCCTTCAG 

84 51 TTCGCGCAGC TGCTGCTGGG CCTGGTGGAG CTCGGCCGTC GTCCACTTGA 
8501 GGTATTCGAC GAGCTTCTCT TCGTTCGCCA CGGGAATGGT CAGCCTTCCT 

85 51 GTTCTCGCGC GTGAAGCCTC AGGTGGGACG AGGTCGGGCA AGGTGGGCAG 

86 01 GCAGGAGCCG CGCGCTGTGG GTGCCAGGGT CGCCGCGGCT GCTTAAGCGG 
8651 GTCTAACTCC CGCCTTGCCG CCGGGCATCG CCTCGCACGA GCGGGCCAGC 

87 01 AG CAGGAGGT CGGCGGCGAT CTCGTCGGGT GCGCCGGCGT GCAGATCGTG 
8 751 GTCGGAG CCC GGGT ACCAG C GCACGCTCAC CTGCTCCAGG GCCGCCTCGG 
8801 CGGCGGCCAC CCAGGCCCGT ACCTGGTCGG ACAGTTGGGG GATGGCGGGG 
8 851 ATGAGGGGCA GCAGCCGCAC CGGCACGGTG ACCTTGGGAT ACCAGTCGGC 
8901 CGGTGCCTCC CGTTGCAGGC CGGCGACGAT CGACATGACC TGTGTCGAGG 
8 951 TCAGGCGGGG GATGAGCAGG COGTCCGGCC CGACGCGGTA GTCCGCC AGG 
9001 CGTGCCTCGA TGGACGTGGG CGACCAGTCG GGATGGGTGG CCCGCAGGTA
9051 GGCGCGCATG TCGGCGGCGC TGGTGGTGCC CTGCTGGGCG CGCCGGACCA

9101 CGTCGGCGGT GCGCTCCCAG AAGGCGCGCA TCACCGGTCC GTCGAACTCG 

9151 TACCAGCCGC CGTCGAT CAG GGCGAGACCG GCCACCAGGT CCGGGTGCTC 

9201 GGCCGCCAGG CGCAGCGCGA GGTGCGCGCC CCAGGAGTGC CCGGCCACCA 

9251 GTGCGCCGGA CAGGTCGAGG GCGGTGACGG CCGCCACCAG GTCGGTGACG 

93 01 ACCGTCGCGT TGTCGTAC CC GTCGGGCGGG GTGTCCGACT CGCCGTGC?CC 
9351 GCGGTGGTCG ACGGCGTAGG CCGGGTGTCC GGCGGCGGCG AGACGGGCGG 

94 01 CGACCTCGTC CCACATCCGG GCGTTCGACA GCATGCCGTG CAGCAGCAGG 
9451 AACGGACGGC CCGGGGCTCC CGGCCCGTCC GCGGGCCGGT ACCTGACATT

95 01 GAGGGAGACG GTCTGCGACA CGGGGATGCG GAGGTTCTTC ACAGGCGGGC 
9551 CCTTGTGATC CCTTGTGCTG GGGGAGGAAA GCGGGGGCGG CACGCTCAGG

96 01 GGCGCTGCGC GGTCGCGAAG ATGTATCCGA GCTCGGGCAT CTTGCCGAGG 
9651 GCCGCCTGGT TGTGCAGGAA CAGCTCGTAT CCCTCTACGC CGATGATGTC 

97 01 GACGTACTCG TCCCGGTGGG CGCGGATCCA CTCGACGTAA CCGTCGTAGG 
9751 TCTTGGCGGT CTCGCGGGTG ATGTCGGTCA GTTCGAGGAC GGTCCAGCCG 

98 01 GCGGCGCGGA AGATGTCGGG GTAGTCCCCG ATGTCGGTGA GCGCGGCGTA 
9851 GATCGTGGTG TCGCTGACGG TGGCGGTCCG GGGCCGGCTG GGATCGGGGT 
9901 TGAGGTAGAC CATGTCGGCG ATCGG CATC C GCGCGCCGGG CTTCACGACG 
9951 CGGTGGGCCT CGGTGAGCAC CTGCTGCTTG TCCGGCATGT GCAGCATGGA 

10001 CTCCAGGGCC CAGCAGTGGT CGAAGGAGCC GTCGTCGAAC GGCAGGTTCA 

10051 TGGCGTCGAC CTGCTCGAAG CGGACCCGGT CGGCGAGGCC GGCCTCGCGC 

10101 GCCCGGCGGT TGCCGCGCTC GACCTGGCGG GCGCTGACGG AGATGCCGAC 

10151 CACCTCGACG TCGCGGGCGC GGGGCAGCTG CATGGCCGGG GTGCCGTTGC 

10201 CGCAGCCGAT GTCGAGGACG CGGTCGCCGG GGGC CGGGT C GAGGCGGCGG 

10251 ATCATCTCGT CGGTCATCTG GACCATGGCC TCGTCGAACG TGGCCTGCTG 

103 01 CTCGCCGCCG TCGAACCAGT AGCCG TAGTG CAGATTGCCG TCTCCGAGCT 
10351 GAGTCATCAG GTCGAAGACC TTGTGGTCGT AGTAGTGGCC GATGTCGCTG

10401 GGCTCGGGGG CG ACGGTCTT GTTCACCGTC GGGGG CTTCT TGGTCGTCGC 

10451 GTTCTTCGTC ACGGCTTCAG CGTCACCGTG CGGCGGCAGG CGCCACAACC

10501 CCACCCCCGC CCCTCAAAAG CCCCTATGGG CCCTCCTCGA CCGCCCCTAG 

10551 GGAGCTGCTC TTGACGCGTT CCATACGGAA CGGGTGGTAC CCCTCCGAAA 

10601 AAAATGAGAG TACGCTCCCA CTAGATATTG AGCTCT CTTT AGGAGGTCGA 

10651 CTCCCATGTC TGCTGATCTG GGTGCGCGGC GGTGGTGGGC CGTCGGTGCT 

10701 CTCGTACTCG CCTCGATGGT CGTGGGCTTC GATGTGACGA TCCTGAG CCT 

10751 GGCGTTGCCC GCCATGGCCG ACGACCTCGG CGCGAACAAC GTCGAGCTGC 

10801 AGTGGTTCGT GACGTCGTAC ACGCTGGTGT TCGCGGCCGG CATGATCCCG 

10851 GCCGGCATGC TCGGTGACCG GTTCGGACGC AAGAAGGTCC TGCTCACCGC 

10901 CCTGGTGATC TTCGGTATCG CCTCGCTGGC CTGTGCCTAC GCGACGTCCT 

10 951 CCGGCACCTT CATCGGCGCG CGTGCGGTGC TCGGTCTGGG CGCCGCGCTG 

11001 ATCATGCCGA CGACGCTGTC GCTGCTGC CG GTCATGTTCT CCGACGAGGA 

11051 GCGGCCGAAG GCCATCGGAG CGGTGGCCGG TGCGGCGATG CTCGCCTATC 

11101 CGCTCGGCCC GATCCTCGGC GGCTACCTGC TCAACCACTT GTGGTGGGGC 

11151 TCCGTCTTCC TGATCAACGT GCCGGTGGTG ATCCTCGCCT TCCTCGCGGT 

112 01 CTCCGCCTGG CTGCCCGAGT CCAAGGCCAA GGAGGCCAAG CCGTTCGACA 

112 51 TCGGCGGCCT GGTGTTCTCC AGCGTCGGTC TCGCCGCGCT GACCTACGGC 

11301 GTGATCCAGG GCGGCGAGAA GGGCTGGACG GACGTCACCA CGCTGGTGCC 

11351 GTGCATCGGC GGTCTGCTCG CCCTCGTGCT GTTCGTGATG TGGGAGAAGC 

114 01 GGGTGGCGGA CCCGCTGGTC GACCTCTCGC TGTTCCGCTC GGCCCGGTTC 

114 51 ACCTCCGGCA CCATGCTCGG CACCGTCATC AACTTCACGA TGTTCGGCGT 

11501 GCTCTTCACG ATGCCGCAGT ACTACCAGGC GGTCCTCGGC ACCGACGCGA 

11551 TGGGCAGCGG CTTCCGGCTG CTCCCGATGG TCGGCGGTCT GCTCGTGGGT 

11601 GTGACGGTCG CCAACAAGGT CGCCAAGGCC CTCGGCCCGA AGACCGCGGT 
11651 CGGCATCGGC TTCGCCCTCC TCGCCGCCGC CCTGTTCTAC GGCGCCACCA

11701 CGGACGTCAG CAGCGGCACC GGCCTGGCGG CCGCCTGGAC CGCGGCCTAC 

117 51 GGACTCGGCC TCGGCATCGC CCTGCCGACC GCCATGGACG CCGCCCTCGG 

11801 CGCGCTCTCC GAGGACTCCG CCGGCGTCGG ATCCGGCGTC AACCAGTCCA

11851 TCCGTACCCT CGGCGGCAGC TTCGGCGCGG CCATCCTCGG TTCCATCCTC 

11901 AACTCCGGCT ACCGCGGCAA GCTCGACCTC GACGGCGTGC CCGAGCAdGC 

11951 ACACGGCGCG GTCAAGGACT CCGTCTTCGG CGGCCTCGCG GTGGCCCGGG 

12 001 CGATCAAGTC CAACGGACTG GCCGACTCGG TGCGTTCCGC GTACGTCCAC 

12 051 GCCCTGGACG TGGTGCTCGT GGTCTCCGGC GGCCTCGGAC TGCTGGGTGT 

12101 GGTG CTGGCG GTGGTGTGGC TGCCCCGCCA TGTTGGTCAG AGCACCGCCA 

12151 AGACAGCAGA ATCTGAGCAT GAAGCCGCAG ACGCAGTCTG ACCAGGGCAA 

122 01 AACAGTGCCT GGTCTGAGAG AACGCAAGAA GGCCCGGACG AAGGCCGCGA 
12251 TTCAGCGGGA GGCGGTGCGC TTGTTCAGGG AACAGGGCTA CACCGCCACG 

123 01 ACCATCGAGC AGATCGCCGA AGCCGCCGAG GTCGCTCCCA GCACCGTCTT 
12351 CCGCTACTTC GCGACCAAGC AGGACCTGGT CTTCTCGCAC GACTACGATC 

124 01 TGCCCTTCGC GATGATGGTC CAGGCCCAGT CACCCGACCT GACGCCGATC 

124 51 CAGGCCGAGC GGCAGGCCAT CCGCTCGATG TTGCAGGACA TCAGCGAGCA 

125 01 GGAACTGGCC CTGCAGCGCG AGCGGTTCGT CCTGATTCTC TCCGAGCCGG 
12551 AGCTCTGGGG CGCCAGCCTC GGCAACATCG G C C AG ACCAT GCAGATCATG 
12601 AG TG AG CAGG TGGCCAAACG GGCCGGGCGC GACCCGCGGG ACCCCGCGGT 
12651 CCGCGCCTAC ACCGGAGCCG TGTTCGGAGT GATGCTCCAG GTCTCGATGG 
127 01 ACTGGGCCAA CGATCCGGAC ATGGACTTCG CGACCACGCT GGACGAGGCA 
12751 CTCCACTACC TGGAAGACCT GCGGCCCTGA CCGAAGGGGC GGGCGCACAC 
12 801 CACAGAGCCC GCCCCGGCCA GACGTCGTAC GAGGCGCCAT CGGCCGTCGC 
12851 GTACGACCCC CGCGCCCCGG ATTCCCCCGC GGGGCGCGGG GTCAAGGGAA 
12901 AAGAGACGAC CGCACGCGGC CACTGTTCCC CCGGCTGCCG CGTCCGGTCC
12951 AACCTGGCGT GCTCCGGCTT CCCTCGACGG AGCACGCCAG GGGTCTGTCC
13001 GGCCCTCTCC CGGCGGCTCC CGTCAGACGC CCGGCCCCGC CGTCAGCGCC
13051 TCGGTCACGA CGGCCGCCAC CTGCTCCTGA CAGCCGTCGA GGTAGAAGTG
13101 CCCGCCGGGC AGCACCCGCA GATCGAACGC CGCGCCGGTC CGCTCCCGCC
13151 ACGTGGCGGC CTGCTCCGGC GACGTCCGCT CGTCGGCGTC CCCGATCAGC
13201 GCCGTGATCG GGCAGTCGAG CCGGCCGGGT CCCGGCGCCT CGTAGGTGbc
13251 CACGGCCCGG TAGTCGGCCC GCAGCGCGGG CAGGACGAGC TCCTGCAGCT
13301 CGGGACTGCG GAAGAACCGC TCGTCGGTGC CGCCCATCGC CCGCAGATGG
13351 GCCAGGATGT CCGCGTCCCC GAACGCCCCC GAACGCCCCG CGGGACGGTA
13401 GGGCCGCGCG AGCCCCCCGG AAACGAACAG GTGCACGGGA AGGCCGGGCC
13451 CGGCCGGTCC CCGCAGCCGC CGCGCCACCT CGAACGCCAC GATGGCGCCC
13501 AGGCTGTGCC CGAACAGCGC GAATGGCTTC CCGTCGCACG GCAGGTGGGG
13551 CACGACGCCG TCGGCGAGCT CGGCCACCGA CGCCAGGCAC GGCTCCGCAT
13601 GACGGTCCTG CCGCCCCGGA TACTGCACGG CGAGCACCTC GACGCCGGGC
13651 GCGAGCAGCC CGGAGAGCCC GAAGTAGTAA CTCGCCGAAC CGCCCGCGAA
13701 CGGAAAGCAG ACCAGCCGCA CCGGCGCCTC TGCCGCAGCG TGGTACCGCC
13751 GCAACCACAC CCCGTTTCCG GTGGCTGCAC CGAACTCGTC ACCGATCTGT
13801 GGTGCCCGCG CCGCCGTGCC CCTGTCCATC GTTCTCCCTC TCCTCGCGTC
13851 GCTCCGCGGG CGCTGTCCTG CCCCGCCCCG AAAGCCCGAT GCCGGCCAAG
13901 CCCCGATGCT GGCCAAACCC CGATGCCGGC CAAGCCCCGA TGCTGGCCGC
13951 GGCCCATAGC GCCCGGCTAA AGCCGCAGGC GGCTAGCCGG GGTTTGGTTC
14001 GCCTTTAGAC AGCCCACCCA CGATGAGCCC GGTACTCGAA GCGATCTCCG
14051 ATTTOGGACC GGGAGCGCCG TTGATGTTTT GTGGCAGCCA GTTGTTCAGC
14101 GCGCGACCGC AGCTGACGTG ATGGCCGCAT CCGCGTCAGC GTCCCCCTCG
14151 GGACCGAGCG CAGGACCCGA CCCGATCGCC GTCGTCGGGA TGGCCTGCCG
14201 GCTGCCCGGA GCACCTGACC CCGACGCGTT CTGGCGGCTG CTCAGCGAGG
14251 GGCGCAGCGC GGTGAGCACC GCACCGCCCG AGCGGCGGCG AGCCGACTCC

143 01 GGCCTCCACG GGCCGGGCGG CTACCTGGAC CGGATCGACG GCTTCGACGC 
14 351 GGACTTCTTC CACATCAGCC CGCGCGAGGC CGTGGCGATG GACCCCCAGC 

144 01 AGCGGCTGCT CCTCGAACTG AGCTGGGAGG CCCTCGAAGA CGCGGGCATC 
144 51 CGGCCGCCCA CCCTGGCGCG CAGCCGCACC GGCGTCTTCG TCGGCGCGTT 
14501 CTGGGACGAC TACACCGACG TCCTGAACCT GCGGGCG CCG GGCGCCG-fCA 
14 551 CCCGCCACAC CATGACCGGC GTGCACCGCA GCATTCTGGC CAACCGCATC 
14 601 T CGTACGCGT ACCACCTGGC CGGTCCGAGC CTCACCGTCG ACACCGCACA 
14 651 GTCCTCCTCG CTCGTCGCCG TCCACCTGGC CTGCGAGAGC ATCCGCAGCG 
14 701 GCGACTCCGA CATCGCCTTC GCGGGCGGCG TCAACCTCAT CTGCTCGCCG 
14 751 CGCACCACCG AGCTGGCCGC GGCCCGCTTC GGCGGTCTCT CGGCCGCAGG 
14801 CCGCTGCCAC ACCTTCGACG CCCGCGCCGA CGGTTTCGTA CGCGGCGAGG 
14851 GCGGCGGCCT CGTGGTGCTC AAGCCCCTCG CGGCGGCACG GCGCGACGGC 
14 901 GACACGGTGT ACTG CGTGAT CCGGGGGAGC G C CGTCAAC A GCG ACGGTAC 
14 951 GACCGACGGA ATCACCCTGC CCAGCGGGCA GGCGCAGCAG GACGTGGTGC 
15001 GCCTCGCCTG CCGACGGGCG CGGATCACGC CGGACCAGGT GCAGTACGTC 
15051 GAACTGCACG GCACCGGCAC GGCCGTCGGG GACCCGATCG AGGCCGCCGC 
15101 GCTCGGCGCC GCCCTCGGGC AGGACGCCGC CCGCGCCGTG CCGCTGGCCG 
15151 TCGGCTCCGC CAAGACGAAC GTCGGCCACC TCGAAGCCGC CGCCGGAATC 
15201 GTCGGACTGC TCAAGAGCGC CCTGAGCATC CACCACCGGC GGCTGGCGCC 
15251 GAGCCTGAAC TTCACCACCC CCAATCCGGC CATCCCGCTC GCCGACCTCG 
153 01 GCCTGACCGT CCAGCAGGAC CTGGCCGACT GGCCGCGCCC CGAACAGCCC 
15351 CTGATCGCCG GGGTGTCGTC CTTCGGCATG GGCGGCACGA ACGGTCACGT
15401 TGTCGTGGCG GCGGCGCCCG ATTCGGTGGC GGTACCTGAG CCGGTGGGGG 
15451 TGCCTGAGGG GGTGGAAGTG CCTGAGCCGG TGGTGGTTTC TGAGCCGGTG 
15501 GTGGTGCCGA CGCCATGGCC CGTGAGCGCT CACAGCGCTT CCGCGCTGCG 




11 SEPTEMBER 2002

15551 CGCGCAGGCC GGTCGCCTGC GGACGCACCT CGCCGCCCAC CGCCCCACCC 

15601 CCGACGCCGC GCGGGTCGGC CACGCGCTCG CCACCACCCG TGCGCCCCTC 

15 651 GCCCACCGCG CGGTCCTGCT CGGCGGCGAC ACCGCCGAAC TGCTGGGCTC 

15701 CCTGGACGCG CTGGCCGAGG GCGCGGAGAC CGCGTCCATC GTGCGCGGCG 

15751 AGGCGTACAC CGAGGGCAGG ACGGCCTTCC TCTTCAGTGG GCAGGGAGCG 

15 801 CAACGCCTCG GCATGGGGCG GGAGTTGTAT GCCGTGTTCC CCGTCTTCGC 

15 851 CGACGCTCTC GACGAGGCGT TCGCCGCCCT GGACGTACAT CTGGACCGCC 

15 901 CACTGCGCGA GATCGTCTTG GGCGAGACCG ACTCGGGTGG GAACGTCTCG 

15 951 GGTGAGAATG TCATCGGCGA GGGTGCCGAC CATCAGGCAC TCCTCGACCA 
16001 GACCGCCTAC ACCCAGCCCG CGCTCTTCGC GATCGAGACG AG C CTGTAC C 
16051 GGCTGGCAGC CTCCTTCGGC CTGAAGCCGG ACTACGTCCT CGGCCACTCG 
16101 GTCGGCGAGA TCGCCCCCGC GCACGTCGCC GGTGTCCTCT CGTTGCCGGA 
16151 CGCGAGCGCT CTGGTGGCCA CGCGGGGACG GCTCATGCAG GCGGTTCGCG 

16 201 CGCCCGGCGC GATGGCCGCG TGGCAGGCCA CGG CGGACG A GGCGGCCGAA 
162 51 CAGCTCGCCG GGCACGAGCG GCACGTCACC GTGGCCGCCG TCAACGGCCC 
16301 CGACTCCGTG GTCGTCTCCG GCGACCGCGC CACCGTCGAC GAACTGACCG 
16351 CCGCCTGGCG GGGACGCGGC CGCAAGGCCC ACCACCTGAA GGTC AG C C AC 
164 01 GCCTTCCACT C CCCGC AC AT GGACCCC ATC CTCGACGAGC TGCGCGCGGT 
16451 CGCCGCCGGC CTGACCTTCC ACGAGCCGGT CATTCCCGTC GTCTCCAACG 
16501 TCACCGGTGA ACTGGTGACC GCGACCGCGA CCGGGAGCGG CGCCGGGCAG 
16551 GCCGACCCCG AGTACTGGGC GCGGCATGCG CGCGAGCCCG TGCGGTTCCT
16601 GTCCGGGGTG CGGGGGCTGT GCGAGCGCGG GGTGACCACG TTCGTCG AG C 
16651 TCGGCCCGGA CGCACCGCTG TCCGCGATGG CCCGCGACTG CTTCCCCGCC
16701 CCCGCGGACC GGAGCCGTCC GCGCCCCGCC GCCATCGCCA CATGCCGCCG 
16751 CGGGCGCGAC GAGGTGGCCA CGTTCCTGAG GTCGCTGGCC CAGGCGTACG 
16801 TCCGOGGCGC CGATGTCGAC TTCAC CCGGG CCTACGGCGC CACCGCCACG 
16851 CGCCGCTTCC CCCTCCCCAC GTATCCCTTC CAGCGCGAGC GCCATTGGCC

16 901 TGCCGCTGCC GGGGTGGGGC AGCAGCCGGA GACCCCGGAA CTTCCGGAAT 
16951 CCTCGGAGTC CTCGGAGCAG GCAGGGCATG AGCGGGAGGA GGGGGCGCGC 

17 001 GCGTGGGGCG GGCCTGAAGG GCGGCTTGCC GGGCTCTCCG TGAACGACCA 
17 051 GGAGCGGGTC CTCCTCGGCC TGGTCACCAA GCACGTGGCC GTCGTGCTCG 
17101 GGGACGCCTG GGG CACGGTA CAAGCCGCCC GCACCTTCAA GCAGTTGGGC 
17151 TTCGACTCGA TGGCCGCCGC CGAGCTGAGC GAACGGCTCG GCACGGAGAC 
17201 GGGCCTGCCG TTGCCCGCCA CCCTCACCTT CGACTACCCG ACCCCTCTGG 
172 51 CCGTCGCCGC GCACCTGCGC GCGGAGCTCA CCGGTACGCC CGCCCCGGCC 
17 3 01 GGCTCCGCGC CCGCCACGGG CGCCCTCGGC GCGGGTGACC TCGGCACGGA 
17351 CGAGGACCCG GTCGCCATCG TGGCCATGAG CTGCCGCTAT CCCGGCGGCG 
174 01 CAGGCACGCC CGAGGACCTG TGGCGGCTGG TCGCGGACGG CGCCGACGCG 
17451 ATGGGAGACT TCCCCACCGA CCGCGGCTGG GACCTGGCGC GGCTGTTCCA 
17501 CCCCGACCCC GACCGGTCGG GCACCAGCTG CACGCGGCAG GGCGGATTCC 
17 551 TGTACGACGC CGCCGACTTC GACGCCGAGT TCTTCGACAT CAGCCCGCGC 
17 601 GAGGCCCTGG CCGTCGACCC GCAGCAGCGG CTGCTCCTCG AGTGCGCCTG 
17 651 GG AGGCCTTC GAACGGGCGG GCCTGGACCC GCGGGCGCTC AAGGGCAGCC 
17701 CCACCGGCGT GTTCGTCGGC ATGACGGGGC AGGACTACGG CCCCCGTCTG 
17751 CACGAGCCGT CCCAGGCCAC CGACGGCTAT CTGCTGACCG GCAGCACGCC 
17 801 GAGCGTGGCC TCGGGCCGCC TGTCGTTCAG CTTCGGCCTT GAGGGGCCCG 
17 851 CCCTGACGGT GGACACGGCC TGCTCGTCGT CGCTGGTCAC GCTCCATCTC 
17901 GCGGCGCAGG CGCTGCGG CG CGGCGAGTGC GACCTGGCCC TCGCCGGCGG 

17 951 CGCCACCGTC CTGGCCACGC CGGGCATGTT CACCGAGTTC TCGCGGCAGC 

18 001 GGGGCCTGGC CCCCGACGGC CGCTGCAAGC CGTTCGCGGC GGGCGCCGAC 
180 51 GGCACGGGCT GGGCCGAGGG CGTGGGCCTG GTCCTCCTCG AAAGGCTCTC 
18101 CGAGGCCCGG CGCAAGGGGC ACGCCGTCCT CGCGGTGATC CGGGGTTCGG 
18151 CGATCAACCA GGACGGCGCG AGCAACGGCC TGACCGCGCC CAACGGCCCC

18201 TCGCAGCAAC GCGTCATCCG TGCCGCGCTC GCGGCCGCCC GGCTCACCGC 

18251 GGACGAGGTC GACGTAGTGG AGGCGCACGG CACCGGCACC ACGCTCGGCG 

18301 ACCCGATCGA GGCGCAGGCC CTGCTCGCCA CGTACGGCCA AGGGCGTTCG 

18351 GCGGAGCGGC CGTTGTGGCT CGGGTCGGTG AAGTCGAACA TCGGTCACAC 

18401 GCAGGCCGCC GCGGGTGTCG CGGGCGTCAT CAAGATGGTG ATGGCGATGC 

18451 GCCACGACCT GCTCCCCGCC ACCCTGCACG TCGACGAGCC GAGTGGCCAC 

18501 GTGGACTGGT CCACCGGCGC GGTGCGACTG CTCACCGAGC CGGTCGTCTG 

18551 GCCGCGCGGC GAACGTCCGC GCCGCGCCGC GGTGTCGTCC TTCGGGATCT 

18601 CCGGCACGAA CGCGCACCTG GTGCTCGAAG AGGCGGGGCA GGACGAGTAC 

18 651 GTTGCGGGAG CCGCCGACGA CGCCGGGCCG GTGGACGGTG CTGTG CTGCC 

18701 GTGGGTGGTT TCCGGACGGA CCGGAGCGGC GCTGCGCGAA CAGGCCCGCC 

18751 GTTTGCGTGA G TTGGTGACC GGCGGCTCGG CCGATGTCTC TGTGTCCGGG 

18801 GTGGGCCGGT CGCTGGTCAC CACGCGGGCG GTGTTCGAGC ACCGGGCCG T 

18851 GGTCGTGGGC CGCGACCGGG ACACGCTGAT CGGCGGCCTC GAGGCCCTTG 

18901 CGGCGGGTGA CGCGTCGCCG GACGTCGTGT GCGGGGTCGC GGGCGATGTC 

18951 GGCCCCGGCC CGGTGCTGGT GTTCCCCGGG CAGGGCTCGC AGTGGGTGGG 

19001 CATGGGAGCC CAACTCCTTG GCGAGTCCGC GGTGTTCGCG GCGCGGATCG 

19051 ACGCGTGCGA GCAGGCGCTG TCCCCGTACG TCGACTGGTC ACTGACAGAG 

19101 GTCCTGCGCG GGGACGGGCG CGAACTGTCG CGCGTCGACG TCGTCCAGCC 

19151 CGTGCTGTGG GCGGTGATGG TCTCGCTCGC CGCCGTCTGG GCGGACCACG 

192 01 GCGTCACCCC GGCCGCCGTC GTCGGGCACT CCCAGGGAGA GATCGCCGCT 

19251 GTGGTCGTCG CCGGCGCGCT CACCCTGGAG GACGGCGCCA AGATCGTGGC 

19301 CCTGCGCAGC CGGGCGCTGC GTCAGCTCTC GGGCGGGGGC GCCATGGCCT 

19351 CCCTCGGGGT GGGCCAGGAA CAGGCAGCCG AACTCGT CG A GGGCCACCCC 

19401 GGAGTGGGCA TCGCCGCCGT CAACGGCCCG TCATCGACCG TCATTTCAGG 
19451 CCCGCCCGAG CAAGTCGCCG CCGTCGTCGC CGACGCCGAG GCGCGCGAGC
19501 TGAGAGGCCG CGTCATTGAC GTGGACTACG CCTCGCACAG CCCCCAGGTC
19551 GACGCCATCA CCGACGAACT CACCCACACC CTGTCCGGCG TCCGCCCCAC
19601 QACGGCCCCG GTGGCGTTCT ACTCGGCCGT GACCGGAACC CGCATCGACA
19651 CGGCGGGCCT CGACACCGAC TACTGGGTCA CCAACCTGCG CCGCCCGGTC
19701 CGGTTCGCCG ACGCCGTCAC CGCGCTCCTC GCCGACGGCC ACCGGGTCTT
19751 CATCGAGGCC AGCAGCCACC CCGTCCTCAC CCTCGGCCTC CAGGAGACCT
19801 TCGAGGAGGC CGGGGTCGAC GCCGTCACCG TCCCCACCCT GCGGGGCGAG
19851 GACGGCGGCC GGGCACGCCT GGCCCGCTCG CTGGCACAGG CCTTCGGCGC
19901 CGGGTGCGCG GTGAGGTGGG AGAACTGGTT TCCGGCCACC GGTACGTCCA
19951 CCGTGGAGCT GCCGACGTAC GCCTTCCAGC GTCGCCGTTA CTGGCTGGAG
20001 GCCCCCACGG GCACCCAGGA GGCGGCGGGC CTGGGCCTCG CCGCTGCGGG
20051 GCACCCGCTC CTCGGGGCGG CCACCGAGAT CGCGGACGGC GACATCCGCC
20101 TGCTCACCGG CCGTATCAGC AGGCACAGCC ACCCCTGGCT CGCTCAGCAC
20151 ACCCTCTTCG GTGCCGCGGT CGTGCCCGCC TCCGTCCTCG CGGAATGGGC
20201 GCTGCGCGCC GCCGACGAGG CCGGCTGCCC GCGTGTCGAC GACCTCACGC
20251 TGCGCACCCC GCTGGTGCTG CCCGAGACCG CGGGCGTGCA GGTGCAGATC
20301 GTGGTCGGCC CGGCCGACGC GCGGGACGGG CACCGCGACT TCCACGTCTA
20351 CGCCCGCCCC GACGGCAAGG ACGCCTCTGA GGGCGAGGGC ATCGCCGAGG
20401 GCGAGGGTGC CTCTGAGGGC GAGGGTGCCT CCGGCGGCAC CGATGCGCCG
20451 TGGACCTGCC ATGCCGACGG CCGACTGGTC GCCGAGCCCA CCGGCACGGC
20501 CTCGGAGGAC TCCCCGGACA CGGTGTGGCC GCCGCCCGGC GCCGAACCCG
20551 TCGACCTGGG CGACTTCTAC GAGCGGGCCG CCGCCACCGG AGTCGGCTAT
20601 GGACCGGTCT TCACGGGGCT GCGCGCCCTG TGGCGGCGGG ACGGCGAGCT
20651 GTTCGCCGAG GCGGTGCTGC CGCAAGAAGC CCCGGAAACC GCCGGGTTCG
20701 GCATGCACCC GGCGCTCCTC <3ACGCCGCAC TGCACCCCGC ACTCCTCGGC
20751 GAGCGGCCGG CCGAGGAGGA CAAGGTGTGG CTGCCGTTCA CGCTGACCGG

2 0801 AGTGACCCTG TGGGCCACCG GTGCCACCTC TGTACGCGTC CGTCTCACCC 

2 0 851 CGCTGGACGA CGACCCCGAC GCGTCGGCGG ACGGGCGGGC CTGGCGGGTC 

2 0901 GGCGTGAGCG ACCCGACCGG CGCGGAGGTG CTGACCTGCG AGGCCCTGGT 

2 0951 CGCGGTGGCG GCGGGCCGCC GCGAGCTGCG GGCCGCGGGG GAGCGGGTGT 

210 01 CCGATCTGTA CGCGGTGGAG TGGGTG C CGG TGCCGGGCCC GGGGCCGcTtG 

210 51 GGTGAGGGTG CTGACTTCTC GGGCTGGGCC GGTCTGGGGG AGTGCGGGGA 

21101 GCGTTGGGAG TGCGTGGGGC GCGTGGAGCG CTGGTACGAG GACCTGGACG 

21151 CTCTCGGCGC GGCTGTCGAG GGTGGGGCTT CGGTGCCCTC TGTCGTTCTC 

212 01 GCCACCGCGG CTGCCGCCCC TGGTGGAGCG GGCGACGGAG CCGCCGATGC 

212 51 GCTGAGCGCG GTGCGG TGG A CCGGCGCGCT CCTCGATCAG TGGCTCGCCG 

213 01 ACGCGCGGTT CGCCGACGCC CGGCTGGTGG TGATCACGTC CGGCGCGGTC 
21351 GCCACGGGTG ACGATTTCCT TCCCGACCCG GCCGCCGCGG CGGTACGAGG 

214 01 ACTGGTCGAG CAGGCGCAGG TCAGGCACCC CGGCCGCATC CTCCTCGTCG 

214 51 ACACGGAAGC CGGGGCCGGG CTCGGGGTCG GCGCCGGAGT GGATGACGCG 

215 01 CTCCTGGAAC AGGCCGTGGC CATGGCTCTC GGCGCCGACG AACCGCAACT 

215 51 CGCCCTGCGC GCGGGG CGGG TCCTGGCGCC CCGCCTCACC GCACCCCAGG 

216 01 ATGCGGCCGT CACCGAAGCG GCGCGACCGC TCGACCCGGA CGGCACCGTA 

21651 CTCATCACAG GGCCGGCCGG TGCTCCGGTG GCCGACCTCG CCGAACACCT
21701 CGTACGCACC GGGCAGTGCA GGCATCTGCT GCTCCTGCCT GGAGACGGTG 

217 51 AACTGGAGGA AATGGCCGAG GAG TTGCGGG GCCTCGGCGC CACCGTGGAC 

218 01 CTGAGTACCG CCGACCCGGC GGACCCGACC GCCCTCGCCG AAGTGGTCGC 
218 51 CGCCGTCGAG GGGGACCATC CTCTTACGGG GGTCATCCAC GCCACCGGAG 
21901 TCGTGGACGC GTTCGATCCC GGCGACTCGG CGAGCGACTT GATGATCGAC 
21951 TCGGCGAGCG ATTCGTTCGC CGAGGCATGG TCGTCGAGGG CGGGCGTCAC 
22001 CGCCGCACTG CACACCGCGA CCGCCCACCT TCCCCTGGAC CTGTTCGCGG 
22051 TCCTGTCCCC GGCGGGCGCG GACCTGGGCA TTGCCCGGTC GGCGGCCGCC

2 2101 GCGGGCGCCG ACGCCTTCAG CGCGGCACTC GCCCTGCGCC GGCACACGAC 

22151 CGTCACGACG GACACGACAG CCCCGCCGCG CACGACAGCC CCGCCGCGAA 

2 2201 CGACAGCCTC GCCGCGCACG ACAGCCCTGT CGTCGTCGCG CACGACGGGC 

2 2251 GTGGCCCTCG CCTACGGGCC GCCCACCGCG CCGAGGCCCG GCATCAAGGG 

22 3 01 GACGGCGCCC GGTCGGATCC CCGTGCTGCT CGACGCCGCT CGCGCTCACG 

22 3 51 GGGGCGGTTC GCCCCTGCTC GGGGCCCGCT TGGCCGCGCG TGCCCTGGCC 

224 01 GCCGAGTCCG CCGCCGAGGG CGTCGCCGGC CTGCCCGCGC CGCTGCGCGC 

2 2451 GCTGGCAGTG GCCGCAGCCG CGGCCGGAGC ACCGACCCGG CGCACCGCCG 

22501 CCGACCGCAA GCCCCCCGCG GACTGGCCGG CCCGACTGGC CCCCCTGTCC 

22551 GCCCCCGAAC AACTCCGTCT G CTCATCG AC GCCGTACGCA CCCACGCCGC 

22601 CGCGGTCCTC GGCCGCACCG ACCCGGAAGC GCTGCGCGGG GACGCCACCT 

22651 TCAAGCAGCT CGGCCTTGAC TCGCTGACCG CCGTGGAGCT GCGCAACCGG 

22701 CTCGTGGAGG ACACCGGTCT GCGGCTGCCC ACCGCCCTCG TCTTTCGCTA 

22751 CCCGACCCCC GCGGCGATCG CCGCGCACCT CCGCGAGCGG CTGACCAGCC 

228 01 CGAGCGAGAC GACCGCCACA CAGAGGTCCG GAGGGCAGAC GCCCGCAGCG 

2 2 851 GGGCAGGCGT CGTCCGCGCT CGCCCCCGGC GGATCGGCCG CCGGACCGCC 

22 901 CGCCGCAGAC ACCGTGCTGA GGG AC CTG AC CCGCATGGAG AACACCCTCT 

2 2 951 CCGTGCTCGC CGCCCAGCTG CCCCACACCG AGACGGGTGA GATCACCACC 

2 3 001 CGGCTCGAAG CGCTCCTCAC GCGCTGGAAG ACCACGAACG CCACGGCGAA 

2 3051 CGACAGCGGC GACGGCAACG GCGGCGATGA CGACGCCGCC GAACGCCTCA 

2 3101 AGGCCGCGTC CGCCGACCAG ATCTTCGACT TCATCGACAA CGAGCTTGGT 

2 3151 GTCGGGCACG GCACCTCGCG CGTGACCCCC ACTCCGAAGG CCGGGTGACC 

23201 GCACATGGCG AGTGAAGAGC AACTGGTCGA AT ATCTG CGC AGGGTGACCA 

23251 CCGAGCTCCA TGACACGCGT CGGCGCCTGG TGCAGGAGGA GGACCGCAGG 

2 3 301 CAGGAACCGG TGGCCCTGGT CGGCATGGCC TGCCGCTTCC CGGGCGGCGT 
23351 GGCCTCACCG GAGGACCTCT GGGACCTGGT CGCCGCGGGC AAGGACGCCA

234 01 TCGAGGACTT TCCCACCGAC CGGGGCTGGG AGCTGG AG G C GCTCTACGAC 

2 3451 CCGGACCCGG CCGCGTACGG GACCAGCTAT GTCCGCCACG GCGGG TTCGT 

23501 GGACGACGCG GGCTCCTTCG ACGCCGACTT CTTCGG CATC AGCCCGCGAG 

23551 AAGCCCTGGC GATGGACCCG CAGCAGCGGC TGATGCTGGA GACGTCCTGG 

23 601 GAGCTGTTCG AGCGCGCCGG CATCGAACCC GTCTCCCTCA AGGGCAGCftG 

23 651 TACGGGCGTC TACGCCGGGG TGTCCAGCGA GGACTACATG TCCCAACTGC 

2 3 701 CCCGCATCCC CGAGGGGTTC GAGGGGCACG CCACCACCGG CAGCCTCACC 

2 3 751 AG CGTCATCT CGGGCCGGGT CGCGTACAAC TACGGCCTCG AAGGCCCGGC 

23 801 CGTCACCGTC GACACAGCCT GTTCCGCCTC GCTCGTCGCC ATCCACCTGG 

23 851 CGAGCCAGGC GCTGCGCCAG CGTGAGTGCG ACCTCGCCCT CGCGGGCGGT 
23901 GTGCTCGTAC TGTCCAGCCC GCTCATGTTC ACCGAGTTCT GCCGCCAGCG
23951 GGGCCTTGCT CCCGACGGCC GCTGCAAGCC GTTCGCCGCC GCGGCGGACG

24 001 GCACCGGCTT CTCGGAGGGC ATCGGTCTGC TCCTCCTGGA GCGCCTGTCC 
24 051 GACGCGCGCC GCAACGGCCA CAAGGTGCTC GCGGTGATCC GCGGCTCCGC 
24101 CGTCAACCAG GACGGCGCGA GCAACGGCCT GACCGCCCCC AACGACGCCG 
24151 CGCAGGAACA GGTCATCCGC GCCGCCCTCG ACAACGCCCG CCTCACCCCG 
24 201 TCCGAGGTGG ACGCCGTCGA GGCGCACGGC ACCGGCACCA AACTGGGCGA 
24 251 CCCCATCGAG GCCGGAGCGC TGCTCGCCAC CTACGGGCAA CACCGCGCCC 
24 301 GGCCCCTCCT CCTCGGCTCC CTCAAGTCCA ACATCGGCCA CACCCACGCC 
24 351 ACCGCGGGCG TCGCCGGTGT CATCAAGACC GTCATGGCGA TCCGCAACGG 
244 01 TCTGCTCCCC GCCACCCTCC ACGTCGAGGA ACTGAGCCCG CACGTCGACT 
244 51 GGGACGCGGG CGCGGTCGAG GTCGTCACGG AGCCCACCCC GTGGCCCGAG 
24 501 ACCGGCCACC CCCGGCGCGC GGGCGTCTCC GCGTTCGGGA TCTCCGGGAC 
24 551 G AATGCGC AC TTGATCCTGG AGGAGGCGCC GCCGGAGGAG GATGTGCCCG 
24 601 CCCCCGTGGT TGTGGAGTCG GGCGGGGTCG TTCCGTGGGT GGTGTCCGGG 
24651 CGGACGCCGG AGGCGCTGCG TGAAGAGGCC CGGCGACTCG GCGAGTTCGT
24701 GGCAGGCGAC ACGGACGCAC TGCCGAACGA GGTCGGCTGG TCCTTGGCCA
24751 CGACCCGGTC GGTGTTCGAG CACCGGGCTG TGGTCGTGGG GCGTGACCGG
24801 GATGCGTTGA CGGCTGGCCT GGGGGCGTTG GCTGCGGGTG AGGCTTCGGC
24851 GGGTGTGGTG GCCGGGGTGG CCGGTGATGT GGGTCCTGGG CCGGTGTTGG
24901 TGTTTCCGGG GCAGGGGGCG CAGTGGGTGG GCATGGGTGC CCAGCTGT*TG
24951 GACGAGTCTG CGGTGTTCGC GGCGCGGATC GCGGAGTGTG AGCGGGCCCT
25001 GTCGGCGCAT GTGGACTGGT CGCTGAGTGC GGTGTTGCGC GGGGACGGGA
25051 GTGAGCTGTC CCGGGTGGAA GTGGTGCAGC CGGTGCTGTG GGCGGTGATG
25101 GTCTCGCTGG CTGCGGTGTG GGCGGATTAC GGGGTCACTC CGGCTGCCGT
25151 GATCGGGCAC TCGCAGGGTG AGATGGCTGC CGCGTGTGTG GCGGGGGCGC
25201 TGTCGCTGGA GGATGCGGCG CGGATCGTAG CGGTACGCAG TGACGCGCTT
25251 CGTCAGCTGC AAGGGCACGG CGACATGGCC TCGCTCAGCA CCGGTGCCGA
25301 GCAGGCCGCT GAGCTGATCG GTGACCGGCC GGGCGTGGTC GTCGCGGCGG
25351 TCAATGGGCC GTCGTCTACG GTGATTTCAG GGCCGCCGGA GCATGTGGCA
25401 GCCGTGGTCG CGGATGCGGA GGCACGTGGT CTGCGCGCCC GTGTCATCGA
25451 CGTCGGCTAT GCCTCGCATG GCCCCCAGAT CGACCAGCTC CACGATCTGC
25501 TGACCGAACG CCTGGCCGAC ATCCGGCCCA CGAACACGGA CGTGGCCTTC
25551 TATTCGACGG TCACCGCCGA GCGCCTGACG GACACCACGG CCCTGGACAC
25601 GGATTACTGG GTCACCAACC TCCGTCAGCC CGTCCGGTTC GCCGACACCA
25651 TCGAAGCCCT TCTCGCGGAC GGCTACCGCC TGTTCATCGA GGCCAGCGCC
25701 CACCCCGTGC TGGGCCTGGG CATGGAGGAG ACCATCGAGC AGGCGGACAT
25751 GCCCGCCACC GTCGTCCCCA CCCTCCGCCG CGACCACGGC GACACCACCC
25801 AGCTCACCCG CGCCGCCGCC CACGCCTTCA CCGCCGGCGC CGATGTCGAC
25851 TGGCGGCGCT GGTTCCCGGC CGACCCCGCC CCCCGCACGA TCGATCTCCC


25901 CACCTACGCC TTCCAGCGCC GCCGCTACTG GCTGGCCGAC ACAGTGAAGC

25951 GGGACAGCGG ATGGGACCCG GCCGGGTCGG GGCATGCCCA GTTGCCGACC

26001 GCGGTCGCCC TCGCCGACGG GGGAGTGGTG CTGAACGGCC GGGTGTCCGC

26051 CGAGCGCGGT GGCTGGCTGG GCGGGCATGT GGTGGCGGGG ACGGTTCTGG

26101 TGCCGGGTGC GGCGTTGGTG GAGTGGGTGT TGCGGGCCGG TGATGAGGCG

26151 GGTTGCCCCT CGCTTGAGGA GTTGACGCTC CAGGCGCCGT TGGTGTTGCC

26201 CGAGTCGGGT GGGTTGCAGG TTCAGGTGGT CGTGGGTGCG gctgatgaTsc

26251 AGGGCGGCCG TCGTGACGTA CATGTGTATT CGAGGTCTGA GCAGGACGCG

26301 TCGGCGGTGT GGCAGTGCCA TGCCGTCGGT GAGCTCGGGC GCGCGTCGGT

26351 GGCGCGGCCG GTGCGGCAGG CCGGGCAGTG GCCTCCGGCG GGGGCCGAGC

26401 CGGTGGAGGT GGGCGGCTTC TACGAGGGGG TCGCGGCCGC CGGTTACGAG

26451 TACGGTCCGG CGTTCCGTGG GCTGCGCGCG ATGTGGCGGC ACGGTGATGA

26501 CCTCCTTGCG GAGGTCGAGC TGCCGGAGGA GGCCGGTTCG CCGGCCGGTT

26551 TCGGCATCCA CCCGGCGCTG CTGGACGCCG CCCTGCACCC GCTGCTCGCA

26601 CAGCGGAGCC GGGACGGGGC CGGGGCGGGG GCCCACGGCG GGCAGGTGCT

26651 GCTGCCTTTC AGCTGGAGCG GTGTTTCCCT GTGGGCCAGC GAGGCCACCA

26701 GTGTGCGGGT GCGGCTCACC GGGCTGGGAG GAGGGGACGA CGAGACGGTG

26751 TCCCTGAGGG TAACCGACCC CGCCGGTGGC CCCGTGGTGG ACGTGGCAGA

26801 GCTGCGGTTG CGGTCGACGA GCGCCCGGCA GGTGCGGGGT TCGGCAGGCC

26851 CCGGCGCGGA CGGGCTCTAC GAGCTGCGGT GGACACCGTT GCCCGAGCGG

26901 CTTCCCGTAC CGGCCCCCGC GAACGGTCGC GATGTGGCCG CCGACCTGTC

26951 CGGATGCGCG GTGCTCGGCG AACTGGTCGC GGAACCGGGC CCGGGCATCG

27001 ACCTGGAGGG CTGCCCCTGC TACCCGGGCG TCGGCGCGCT CGCCGACAAC

27051 GCCTCCCCGC CCTCGATGAT CCTCGCCCCC GTGCACAGCG ACACCACAGG

27101 CGGCGACGGA CTCGCCCTGA CGGAACGGGT GTTGCGCGTC ATCCAGGACT

27151 TCCTGGCTGC ACCGAGTCTG GAACAGAAAC AGACGCGCCT GGCCTTCGTG

27201 ACCCGGGGCG CGGCGGACAC AGGTAGCACG ACGGGAGGCT CGGCTGCCCC






27251 GGCAGAGGCA GTCGACCCGG CGGTCGCGGC CGTATGGGGC CTAGTACGCA

27301 GCGCGCAGTC GGAGAACCCC GGCCGCTTCG TACTGCTGGA CACCGACGCG

27351 CCCCTCGACC AGGCGTCCGT TGCCCCTCTC GTGGACGCGG TGCGGTCTGC

27401 CGTGGAGGCG GACGAGCCCC AAGTCGCCCT GCGCGGGGGA CGGTTGCTCG

27451 TGCCCAGGTG GGCGCGGGCC GGCGikGCCCG TCGAGCTGGC CGGGCCGGCC

27501 GGAGCGCGGG CGTGGCGGCT GGTGGGCGGA GACTCCGGGA CGCTGGAGGC

27551 CGTCGTGGCG GAGGCTTGCG ACGACATTGT GCTGCGCCCG TTGGCGCCGG

27601 GCCAGGTCCG CGTCGCCGTC CATACGGCCG GGGTCAATTT CCGTGACGTC

27651 CTGATCGCCC TGGGCATGTA CCCGGACCCG GACGCGCTGC CCGGCACCGA

27701 GGCGGCCGGC GTGGTGACGG AGGTCGGGCC GGGCGTCACC CGTCTGTGGG

27751 TGGGCGACCG CGTGATGGGC ATGATGGACG GCGCCTTCGG CCCGTGGGCC

27801 GTCGCCGACG CGCGCATGCT GGCCCCGGTC CCGCCCGGCT GGGGCACCCG

27851 GCAGGCGGCC GCCGCTCCCG CCGCGTTCCT GACGGCTTGG TACGGGCTGG

27901 TGGAGCTGGC CGGTCTGAAG GCGGGCGAGC GTGTGTTGAT CCATGCCGCC

27951 ACGGGTGGTG TGGGGATGGC GGCGGTGCAG ATCGCCCGGC ATGTGGGTGC

28001 CGAGGTGTTC GCCACCGCGA GTCCGGGCAA GCACGCCGTG CTGGAGGAGA

28051 TGGGCATCGA CGCCGCCCAC CGCGCCTCGT CGCGCGACCT CGCCTTCGAG

28101 GACGCCTTCC GGCAGGCCAC CGACGGCCGT GGCGTGGACG TCGTCCTCAA

28151 CAGCCTCACC GGTGAACTGC TCGACGCGTC CCTGCGATTG CTCGGCGACG

28201 GCGGGCGCTT CGTGGAGATG GGCAAGAGCG ATCCGCGCGA CCCCGAGCTG

28251 GTCGCGCTGG AGCACCCCGG GGTGTCGTAC GAGGCCTTCG ACCTCGTCGC

28301 CGACGCCGGG CCCGAGCGGC TCGGGCTGAT GCTCGACAGG CTCGGCGAGC

28351 TCTTCGCCGG CGGATCACTG GTACCGCTGC CGGTCACCGC ATGGCCGCTG

28401 GGGCGGGCGC GAGAGGCGCT CCGCCACATG AGTCAGGCGA GGCACACCGG

28451 CAAGCTGGTG CTCGACGTGC CCGCGCCGCT CGACCCCGAC GGCACCGTCC

28501 TCGTCACCGG GGGTACCGGC ACCATCGGCG CCGCCGTGGC CGAACACCTG

28551 GCGCGTACCG GGGAGAGCAA GCACCTGCTC ATCGTCAGCC GCAGCGGGCC

28601 GGCCGCCCAC GGCGCCGAGG AACTTGTCTC TCGTATAGCC GAGTTCGGGG

28651 CCGAAGCCAC CTTCGTCGCT GCCGACGTGA GTGAGCCCGA CGCGGTCGCC

28701 GCCCTGATCG AAGGGATCGA TCCGGCCCAT CCGCTGACCG GTGTCGTGCA

28751 TGCCGCCGGA GTACTCGACA ACGCTCTGAT CGGCTCCCAG ACCACCGAAA

28801 GCGTCACCCG CGTATGGGCG GCGAAGGCCG CCGCCGCGCA GCAACTCCAC

28851 GAGGCCACGA GGGAGTCGAG GCTGGGACTG TTCGTGATGT TCTCCTCCTT

28901 CGCCTCCACC ATGGGCACCC CAGGGCAGGC CAACTACTCC GCCGCCAACG

28951 CCTATTGCGA CGCGCTGGCC GCTCTCCGAC GCGCGGAGGG GCTCGCCGGC

29001 CTGTCCGTGG CGTGGGGGTT GTGGGAGGCC ACCAGCGGCC TGACCGGGAC

29051 GTTGTCGGCG GCCGACCGGG CCCGCATCGA CCGGTACGGC ATCAGGCCGA

29101 CCAGCGCGGC ACGCGGCTGC GCCCTGCTGG CAGCGGCACG CGCCCACGGG

29151 CGCCCCGACC TGCTCGCCAT GGACCTGGAC GCCCGCGTAC CCGCCGCGTC

29201 CGACGCTCCG GTCCCCGCCG TGCTGCGCAC TCTGGCGGCC GCCGGAGCGC

29251 CCGCCACCGC CCGTCCCACC GCGGCGGCGG CCGCTGACGG GGCGACGGAC

29301 TGGTCCGGCA GGCTCGCCGG CCTCACCGAG GAGGCACGGC TCGAACTCCT

29351 CACCGAGTTG GTGTGCACCC ACGCGGCAGG GGTGCTCGGG CACGCCGACG

29401 CGGGCGCGGT CCAGGTGGAC GCGCCGTTCA AGGAACTCGG CTTCGACTCG

29451 CTGACCGCCG TCGAACTGCG CAACCGGATC GCCGCCGCGA CCGGCCTGAA

29501 ACTGCCCGCC GCCCTCGTCT TCGACTACCC GCAGGCTCGC GTTCTCGCCG

29551 CCCACCTGGC CGAACGGCTC GTCCCGGAGG GCGCGGGGGC CATGGGCGGT

29601 GTGAGCGGTG CGGAGGGCGT GAGGGACGCG TACGGGGCAG GCGGTCCGGG

29651 CGGCGACATG ACCGCCCAGG TCTTGCTGGA GGTGGCCCGC GTCGAGCACA

29701 CCCTGTCCGC CGCCGTCCCG CACGGCCTGG ACCGGGCGGC CGTCGCGGCC

29751 CGCCTGGAGG CGCTGCTCGC CCGCTGCACG GCGACGACGG CGGCCACGGG

29801 GGCCGCGGGA GCCGCGGTGG AGGGTGACGG CGACAGCGAC GGCGACGGCG

29851 CCGTGGATCA GCTGGAGACG GCCACCGCCG AGCAAGTACT GGACTTCATC

29901 GACAACGAAC TCGGGGTGTG AGCCGCGTGC CGGCCGCACA CCAGGCGATC

29951 ACGGGCGGGG AGCTGCAGCG CACATGGTGA GCGAAGAGAA ACTGGTCGAC

30001 TACCTCAAGC GTGTCTCCGC GGACCTGCAC GCCACCCGGC AGCGGCTGCG

30051 CGAGGCGGAG GAGCGCGGCC AGGAACCCGT GGCCGTGGTG GAGGCCGCCT

30101 GCCGCTACCC CGGCGGCATC CGCACCCCCG AAGACCTGTG GGACCTGGTC

30151 GCCGCGGGCG GCAACGCCCT GGGCGCCTTC CCCGACAACC GCGGCTGGGA

30201 CCTGCGACGC CTCTTCCACC CCGACCCCGA CCACCCCGGG ACGACCTACG

30251 CCCGCGAGGG CGGCTTCCTC CACGACGCCG ACCTGTTCGA CCCGGAGTTC

30301 TTCGGCATCA GCCCCCGCGA GGCCGCGGTC CTCGACCCGC AGCAGCGACT

30351 GCTCCTGGAG TGCGCCTGGG AGGCACTGGA GCGCGCGGGC ATCGACCCGC

30401 GGTCCCTCCA GGGCAGCCGT ACCGGCGTGT ACGCGGGTGC CGCCCTGCCC

30451 GGCTTCGGCA CCCCGCACAT CGACCCCGCC GCCGAGGGCC ACCTGGTCAC

30501 CGGCAGCGCC CCGAGCGTCC TCTCGGGCCG GCTCGCCTAC ACCTTCGGCC

30551 TCGAAGGGCC CGCGGTGACG ATCGACACCG CCTGCTCGTC GTCGCTCGTC

30601 GCCGTGCACC TGGCGGCCCA CGCGCTGCGG CAGCGCGAGT GCGATCTGGC

30651 GCTCGCGGGC GGTGTCACCG TCATGACCAC CCCGTACGTG TTCACCGAGT

30701 TCTCGCGCGA GCGCGGCCTG GCCGCCGACG GCCGGTGCAA GCCCTTCGCG

30751 GCCGCCGCGG ACGGCACGGC CTTCTCCGAG GGCGCCGGAC TCCTCGTACT

30801 GGAACGCCTC TCCGACGCCC GCCGGGCCGG CCACCGGGTG CTCGCCGTCA

30851 TCCGCGGCTC GGCCGTCAAC CAGGATGGCG CGAGCAACGG CCTCACCGCC

30901 CCCAACGGCC CCGCCCAGCA GCGCGTGATC CGCGCCGCCC TCGCCGGGGC

30951 GCGGCTCTCG CCCGCGGAGG TGGACGCGGT CGAGGCGCAC GGCACCGGCA

31001 CCCGGCTGGG CGACCCCATC GAGGCCGACG CGCTCCTCGC CACCTACGGT

31051 CAGGAGCGCC ACGGGGGCCG GCCGCTGTGG CTCGGCTCGG TGAAATCCAA

31101 CATCGGCCAC ACGCAGGGCG CGGCCGGTGC CGCGGGCCTG ATCAAGATGG

31151 TCCAGGCACT GCGGCACGAG ACGCTGCCCG CCACGTTGTA CGCCGACGAG

31201 CCCACCCCGC ACGCCGACTG GGAGTCGGGC GCGGTGCGCC TGCTCAGCGC

31251 GCCGGTCGCC TGGCCGCGCG GGGAGCACGG GGAGCACACC CGCAGGGCCG

31301 GCATCTCCTC CTTCGGCATC TCCGGCACGA ACGCCCACCT CATCCTGGAG

31351 GAGGCGCCCG CGGCCGACGC CGAAGGAGCG GGTGGCGACG GCGATGGCGA

31401 CGGGGGAGGG GTGCGGCCGG TGGTGCGGGT CGGCGCCACG GGCCCCCGCG

31451 AAGAGCAGGG CCAAGGACAG GGCCAAGAGC AGCACCAACA GCAACGTCAG

31501 CAGCGGCAGC GGTCGTCGAT GATGCCGACG CCGCACCTCC CGTGGCTGCT

31551 GTCCGCCCGC AGCCCCGCCG CGCTCCGCGC CCAGGCCGAC GCGCTGGCGA

31601 ACCATGTCGC CCACGCGGAC CACTCCATCG CCGACATCGG CGGCACACTG

31651 CTGCGCCGCA CCCTGTTCGA GCACCGGGCG GTCGTCCTCG GAACCGACCG

31701 TGATGAGCGT GCCGCAGCGC TTGCCGCCCT CGCGGCAGGA CGCGCACACC

31751 CCGCGCTGAC CCGGGCCGCA GGGCCGGCGA GGAACGGCGG CACCGCCTTC

31801 CTGTTCACCG GCCAGGGAAG CCAACGCCCA GGCATGGGCA GGCAGTTGTA

31851 CGACACCTTC GACGTCTTCG CCGAGTCGCT CGACGAGACC TGCGCCCGGC

31901 TCGACCCCCT GCTCGAACAG CCGCTGAAGC CCGTCCTGTT CGCCCCCGCC

31951 GACACCGCGC AGGCCGCCGT GCTGCACGGG ACCGGCATGA CGCAGGCCGC

32001 GCTGTTCGCC CTCGAAGTGG CCCTGTACCG CCAGGTCACC TCCTTCGGGA

32051 TCGCCCCCAG CCACCTGACC GGGCACTCCG TCGGCGAGAT CGCCGCCGCC

32101 CACGTCGCCG GGGTGTTCTC CCTGGCGGAC GCCTGCACGC TGGTCGCGGC

32151 CCGGGGCCGC CTCATGCAGG CCCTGCCCGC AGGTGGCGCC ATGCTCGCCG

32201 TCCAGGCGGC CGAGGACGAC GTACTGCCGC TGCTCGCCGG GCAGGAGGAA

32251 CGTCTCTCCC TCGCCGCCGT CAACGGCCCC ACCGCCGTCG TCGTGTCCGG

32301 TGAGGCCGCT GCCGTCGGGG AGGTGGAGAA GGCGCTGCGC GGGCGCGGAC

32351 TGAAGACCAA GCGGCTCAAC GTCAGTCACG CCTTCCACTC GCCGCTCATC

32401 GAGCCGATGC TCGACGACTT CCGCGAAGTG GCCCGCGGGC TGACCTTCCA

32451 CGCGCCGACG CTGCCCGTCG TCTCCAACCT CACCGGCCGC CTCGCCGACG

32501 CGGAGCTGAT GGCCGACGCC GAGTACTGGG TGCGGCACGT ACGCCGGCCG

32551 GTGCGGTTCC ACGACGGGCT GCGCGCTCTC AGCGAGCAAG GCGTCGTGCG

32601 CTACCTGGAG TTGGGGCCCG ACCCGGTCCT CGCCACCATG GTCCAGGACG

32651 GTCTCCCGGC CCCGGCGGAG GGAGAGGAGC CCGAGCCGGT CGTCGCCGCG

32701 GCGCTGCGCT CCAAGCACGA CGAGGGACGC ACCCTGCTGG GTGCCGTCGC

32751 CGCGCTCCAC ACCGACGGAC AGCCGGCCGA CCTCACCGCC CTCTTCCCCG

32801 CCGACGCCGG GCAAGTGCCG CTCCCCACCT ACCGGTTCCA GCGGCGACGG

32851 TACTGGCGCG TCGCGCCCGA CGCCGCCGCG CCGGCCCGCG CCGCCGGCCT

32901 CCAGGAGACC GGCCACCCGC TGCTGCCCGC CGTCATCCGG CAGGCCGACG

32951 GCGGCATCCT GCTCGCGGGA CGCCTGTCCC TGCGTACGCA TCCATGGCTC

33001 GCCGACCACA CCATCGCGGG CGGCGTCCCG CTGCCCGCCA CCGCCTTCGT

33051 CGAACTCGCC CTGCTCGCAG GGCGGCACGC CGCCTGCGAC ACGATCGACG

33101 ATCTGACGCT GGAGACGCCG CTGCTGCTCG ACGACACCGG TACCGGTGTC

33151 GGGGCGGCTG TGGGCGCGGG CGCCGATGCC CTCGTCGATG CCATAGAAGT

33201 GCAGCTTGCC CTCGGCGCTC CCGACGGTTC CGGCCGCCGT GCTCTCACCG

33251 TCCACTCCCG TCCTGCCGAC GATGCGGCTG ACGACGGCGA CGCGGCCGAC

33301 GCGGCCGATG CGGCAGGCCG GGGAGGCCCG GGCGGCTCGG GTGACCTGGG

33351 CGATCCTGGC GATCCGGGCG ATCTGGGCGA CGGCGGGGGC TCCCGCGGCT

33401 GGCGCCGTCA CGCCACCGGC ATCCTCAGCG CCGGCCCGGC CGCCGAACCG

33451 GCCGCCCCCG ACGCCGCTCC CTGGCCGCCC GCCGACGCCA CCGCCCTCGA

33501 CGTCGACGCG CTGTACGCCC GGCTCGACGC GCAGGGCTAC AGCTACGGGC

33551 CCGCCTTCCG GGCCGTCCAC GCCGCCTGGC GGCACGGCGA CGACCTCTAC

33601 GCCGATGTCC GCCTCGCCGA CGAACAGCGC GCTGAAGCCG ACGCGTTCGC

33651 CCTCCACCCG GCCCTGCTCG ACGCOGCCCT GCATGCCGTC GACGAGCTGT

33701 ACCGCGGCAG TGAGGGGCGG GGGCAGGAGC AGGGGCAGGG TGGTCAGGAG

33751 CCGGAGCAGG GCCGTGGCGA CGCGGACGCC CCGGTACGGC TGCCGTTCTC

33801 CTTCAGCGAC ATACGCCACC ACGCCACCGG GGCCACACGG CTGTGGGTCC

33851 GCCTCAGCCC CCAGGGCGAC GATCGGCTGC GGCTGTCCCT GACCGACGGC

33901 GAGGGCGGGC AGGTCGCGAC AGTCGACGCC CTCCAACTGC GGTTGATCCC

33951 CGCCGACCGG TGGCGCGCGG CCCGCCCCAC CACAGCCGCC CCCCTGTACC

34001 ACCTGGACTG GCACGAGCTG CCGTTGCCCG AGCCGGCCGA GACGGACCCG

34051 GCCGCCCACT CCTGGGCTGT GCTCGGAGCG CACGACGCGG GCCTCGCTCC

34101 CGCCGCGCAC TACCCGGACG TGGCGGCCCT GAAAGCCGCC GTCGAGGCCG

34151 GCGAGCCCGT GCCGGACATC GTCTTCGCAC CGTTCCCCGC GGAGGGGACG

34201 GAGACCGATG TCCCGGCTCA GGTACGAGCC CACGCCCGGC ACGCCCTGGA

34251 GCTGCTGCGC GACTGGCTCA CCACGGAAGC TTTCGCCGCC GCCCGCCTCG

34301 TCGTCCTCAC GACCGGTGCG GTCACCGCCC GCCCAGAGGA CGGGCCCGCC

34351 GACCTGGCCA CCGCACCTGT ATGGGGCCTG GTCCGAGCCG CCCAGGCCGA

34401 ACAACCCGAC CATGTCGTCC TGGTGGACAT CGACAAGGAC ATCGATAAGG

34451 ACACCGACGA GGAGACCGAC CAGGCCACCG ACGCGGGCAC CGCATCGCGC

34501 CACGCTCTGC CCGCCGCCTT GGCCGCGGCG GCCGCCCAAG CCGAGACACA

34551 GCTCGCCCTG CGCGCGGGCA CCGTGCTCGT GCCGCGCCTC GCCGTCGTCC

34601 CGCCCCGGAC CGACACCCCA GCGCTGCACG CCACCGCCCC GGAGAGCACC

34651 ACGGACACTG TGGACTCCAC GGGCATCGCG GGCGCTGCGG AATCCGGCGG

34701 CACCGTCCTG ATCACCGGCG GAACCGGCGG CCTCGGGCAG GCCGTCGCCC

34751 GTCACCTCGC CGCCGCGCAT GGCGCCCGCC ACCTGCTCCT CGTCAGCCGC

34801 AGGGGCGACG CCGCCGAGGG CGTCGCCGAG TTGCGCGCCG ACCTCGCGGA

34851 CGACGGCGTC GACGTACGCG TCGCCGCCTG CGACATCACC GACCGCGACG

34901 CGCTGGCCGG GCTCCTCGCG GACATCCCCG CCGCGCACCC GCTCACCGCG

34951 GTCGTGCACA CCGCGGGCGT CATCGACGAC AGCCTCATCA CGGCGATGAC

35001 CCCCGAGCGG CTCGACGCCG TCCTCGCACC CAAGGCCGAC GCGGCCTGGC

35051 ACCTGCACGA ACTCACCCGC GACAAGGACC TGTCGGCCTT CGTCCTGTTC

35101 TCCTCGGGCG CCTCCGTCCT CGGCAACGGC GGCCAGGCCA ACTACGCGGC

35151 CGCCAACACC TTCCTCAACA CCCTCGCCGA ACACCGCCGC GCGGCCGGCC

35201 TCGCCGCCAC CTCCGTGGCC TGGGGCCTGT GGGAGTCCGC GTCCGGCGGC

35251 ATGGCCGCCC GGCTCGGCGA CGCCGACCGC GCCCGCATCC PJZCGCACCGG

35301 CGTGACGGGC CTGACCGACG AGCAGGCCCT GGCCCTCTTC GACGCGGCCC

35351 TGACCGCCGA GCACCCCACG GTCCTCGCCA CCCGCTTCGA CCGCGCCGTG

35401 CTGCGCGGCC AGGCCGCCGC CCGCACCCTG CAGCCCGCCC TGCGCGGCCT

35451 GGTACGCACT CCGCGCCCCA CCGCGTCCGC CGGGGCCATC GGGTCCACCG

35501 CAGCCACCGG GTCCGCCACG GACGAGAACG CGCCCTCCTC GTGGGCCGCC

35551 CGGCTCGCCC GGCTGTCCGC CGCCGACCGC GACCGCGCCC TCAACGAACT

35601 CATTCGCGAG CAGATCGCGA CCGTCCTGGC ACACCCCTCA CCCGACACCA

35651 TCGAACTGGG CCGCGCCTTC CAGGAGTTGG GCTTCGACTC GCTCACCGCC

35701 CTGGAACTCC GCAACCGCCT CTCCACGGCC ACCGGCATCC GGCTGCCCGC

35751 CACCCTCGTC TTCGACCACC CGAGCCCCAC CGCCCTCGTA CGCCATCTCC

35801 ACAGCCATCT CCCCGACGAG GCCCAGCACA CGTCCCCGAC CGCCCCCGGC

35851 GCCTCTGCGG AGGGCACCGC CGCCACGGCC ACCGGCATCG ACGACGACCC

35901 GATCGCCATC GTCGGCATGG CGTGCCGCTA CCCGGGCGGC GTGACCTCGC

35951 CCGAGCAGCT GTGGCAGCTC GTGGCCACCG GCACCGACGC CATCGGCCCG

36001 TTCCCCGAGG ACCGCGGCTG GGACACGGCC GGACTGTTCG ATCCCGACCC

36051 CGACCAGGTC GGCCACAGCT ACACCCGCGA AGGCGGCTTC CTCTACGACG

36101 CCGCCCGCTT CGACGCGGGC TTCTTCGGCA TCAGCCCGCG CGAGGCCGCC

36151 GCCACCGACC CGCAGCAGCG CCTGCTCCTG GAAACCGCCT GGCAGGCGTT

36201 CGAACACGCG GGCATCGACC CCGCCGCCCT GCGCGGCACC CCGTGCGGCG

36251 TCATCACCGG AATCATGTAC GACGACTACG GATCCCGCTT CCTCGCGCGC

36301 AAACCGGACG GCTTCGAGGG CCGCATCATG ACCGGCAGCA CGCCGAGCGT

36351 GGCCTCCGGC CGGGTCGCGT ACACCTTCGG CCTGGAGGGC CCCGCCATCA

36401 CGGTGGACAC CGCGTGCTCC TCCTCGCTGG TCGCGATGCA CCTGGCGGCG

36451 CAGGCGCTGC GGCAGGGCGA GTGCGAACTG GCCCTGGCCG GGGGTGTGAC

36501 CGTGATGGCC ACCCCGAACA CCTTCGTGGA GTTCTCCCGC CAGCGCGGCC

36551 TGGCCCCCGA CGGCCGCTGC AAGCCGTTCG CCGCCGCGGC GGACGGCACC

36601 GGCTGGGGCG AGGGCGCCGG ACTCGTCGTC CTGGAGCGCC TCTCCGACGC

36651 GCGCCGCAAG GGACACCGCG TCCTCGCCCT GCTGCGCGGT TCGGCCGTGA

36701 ACCAGGACGG CGCGAGCAAC GGCATGACCG CCCCGAACGG TCCCTCGCAG

36751 GAACGGGTCA TCCGCACCGC CCTGGCCGGC GCGGGCCGTG GTCCCGAGGA

36801 CATCGACGTG GTGGAGGCGC ACGGCACCGG CACCACGCTC GGCGACCCGA

36851 TCGAGGCGCA GGCCCTGCTC GCCACGTACG GGCAGGGGCG CCCGGAGGAC

36901 CGCCCGCTCT GGCTCGGCTC GGTGAAGTCG AACATCGGCC ACACGCAGGC

36951 CGCCGCCGGT GTCGCGGGCG TCATCAAGAT GGTCATGGCA CTGCGCCACG

37001 AGCAACTGCC CACGACCCTG CACGCCGACG AGCCGACCCC CCACGTGCAA

37051 TGGGACGGCG GCGGCGTACG TCTCCTGACC GAACCGGTCC CGTGGTCGCG

37101 CGGCGAGCGC ACGCGGCGCG CCGGGGTGTC GTCCTTCGGG ATCTCCGGGA

37151 CGAACGCGCA CCTGATCCTG GAGGAGCCGC CGGAGGAGGA CCTGCCCGAG

37201 CCCGTGGCGG CGGAGCCGGG TGGGGTGGTG CCGTGGGTGG TGTCCGGGCG

37251 GACGCCGGAC GCGTTGCGTG AACAGGCGCG GCGGCTCGGC GAGTTTGTCG

37301 TCGGTGCCGG GGATGTGTCG GCAGCCGAGG TGGGATGGTC ACTGGCCACG

37351 ACGCGGTCGG TGTTCGAGCA CCGGGCCGTG GTGGCGGGCC GGGACCGGGA

37401 CGATCTGGTT GCCGGGATGC AGGCGCTGGC GGCAGGGGAG ACGCCGACAG

37451 ATGTCGTGTC CGGTGCGGCG GCTTCCTCCG GTGCGGGGCC GGTGTTGGTG

37501 TTCCCGGGGC AGGGGTCGCA GTGGGTGGGC ATGGGTGCCC AGCTCCTTGA

37551 CGAGTCCCCC GTCTTCGCGG CGCGGATCGC GGAGTGTGAG CAGGCGCTGT

37601 CGGCGTACGT GGACTGGTCG CTGAGTGATG TCCTGCGCGG GGACGGGAGT

37651 GAGCTGTCCC GGGTCGAGGT CGTGCAGCCC GTGTTGTGGG CGGTAATGGT

37701 CTCGCTGGCT GCCGTCTGGG CGGATTACGG GGTCACTCCG GCCGCTGTGG

37751 TGGGGCATTC GCAGGGTGAG ATGGCTGCCG CGTGTGTGGC GGGGGCGCTG

37801 TCGCTGGAGG ATGCGGCGCG GATTGTGGCG GTACGCAGTG ACGCGCTTCG

37851 TCAGCTGCAA GGGCACGGCG ACATGGCCTC ACTCGGCACT GGTGCCGAGC

37901 AGGCCGCTGA GCTGATCGGT GATCGGCCGG GAGTGGTCGT CGCGGCAGTC

37951 AACGGGCCGT CGTCTACCGT GATTTCGGGG CCGCCGGAGC ATGTGGCCGC

38001 TGTGGTCGCG GAGGCGGAGG CACGTGGTCT GCGCGCCCGT GTGATCGACG

38051 TCGGGTATGC CTCGCACGGC CCCCAGATCG ACCAGCTCCA CGACCTCCTC

38101 ACCGAGGGCC TGGCTGACAT CCGGCCCGCG AACACGGACG TGGCCTTCTA

38151 TTCGACGGTC ACCGCCGAGC GCCTGACGGA CACCACAGCC CTGGATACGG

38201 ATTACTGGGT GACCAACCTC CGCCAGCCGG TCCGGTTCGC CGACACCATC

38251 GAAGCGCTTC TCGCGGACGG CTATCGCCTG TTCATCGAGG CCAGCGCGCA

38301 CCCGGTGTTG GGCCTGGGCA TGGAGGAGAC CATCGAGCAG GCGGACATCC

38351 CTGCCACGGT CGTCCCCACC CTGCGCCGCG ACCACGGCGA CACCACCCAG

38401 CTCACCCGCG CCGCCGCCCA CGCCTTCACC GCCGGCGCCG ATGTCGACTG

38451 GCGACGCTGG TTCCCGGCCG ACCCCACCCC CCGTACCGTC GACCTCCCCA

38501 CCTACGCCTT CCAGCACCAG CACTACTGGC TGGAGGAGCC CAGTGGGCTC

38551 ACCGGAGACG CCGCCGACCT CGGCATGGTG GCCGCCGGGC ATCCGCTGCT

38601 CGGTGCCTGT GTGGAACTCG CGGAGAGCGA CTCGTACTTG TTCACCGGGC

38651 GGCTCTCGCG CAGGGCTCCG TCCTGGCTGG CCGAACACGT GGTGGCGGGG

38701 ACGGTTCTGG TGCCGGGTGC GGCGTTGGTG GAGTGGGTGC TGCGGGCCGG

38751 CGATGAGGCG GGATGCCCGA CGATTGAGGA ACTGACGCTC CAGGCGCCGT

38801 TGGTGCTGCC CGAGTCGGGC GGGTTGCAGG TTCAGGTGGT CGTGGGTGCG

38851 ACCGATGAGC AGAGCGGCCG TCGTGACGTA CACGTGTATT CGAGGTCTGA

38901 GCAGGACGCG TCGGCGGTGT GGGTGTGCCA TGCCGTCGGT GTGGTGAGCT

38951 CCGAAATGCC AGAAGCGGCA GCCGAGTTGA GTGGGCAGTG GCCTCCTGCC

39001 GGGGCCGAAG CCGTGGATGT CGAGGACTTC TACGCGCGGG CCGCGGAGGC

39051 CGGATACGCC TACGGTCCGG CGTTCCAGGG GCTGCGGGCG CTGTGGCGGC

39101 ACGGGACGGA GCTGTTCGCC GAGGTGGTGC TGCCCGAACA GGCGGGTGGG

39151 CACGACGGTT TCGGCATCCA CCCGGCGCTG CTGGACGCCG CCCTGCATCC

39201 GCTGATGCTC CTCGACCGGC CCGCGGACGG GCAGATGTGG CTGCCGTTCG

39251 CGTGGAGCGG GGTGTCGCTG AACGCGGACC GGGCGACCCA CGTCCGTGTC

39301 CGGCTCTCCC CGCGGGGGGA GGCGGCCGAG CGTGACCTGC GGGTCGTCAT

39351 CGCCGACGCG ACCGGCGCGC CCGTCCTGAC GGTCGACGCC CTGACCCTGC

39401 GCGCGGCCGA TCCCGGCCGG CTGGGTGCGG CGGCCCGTGG CGGTGTCGAC

39451 GGCCTCTACA CCGTCGACTG GACCCCGCTG CCCCTGCCCC AGCCCCTTCC

39501 GCTGCCGCGG ACGGATGCAG GGGGGAGTGC CGACTGGGTC ATACTCTCGG

39551 ACAACTCCAG TGCAGCTCTG GCTGATGCCG TGTCGTCCGC GACGGCGGCA

39601 GGTGGCGGAG CGCCGTGGGC ATTGCTCGCT CCCGTGGGTG GCGGCTCTGC

39651 CGATGACGGG CTGCCGGTGG TGCGGCGGAC CCTCTCCCTC GTACAGGAGT

39701 TCCTGGCCGC CCCGGAGCTG ACCGAGTCCG GTCTCGTCAT CGTGACACGC

39751 GGTGCCGTGG CCACCGACGC CGATGGTGAC GTCGCGGCGT CCGCGGCAGC

39801 GGTATGGGGC CTGATCCGCA GCGCCCAGTC GGAGAACCCG GGCCGCTTCG

39851 TCCTGCTCGA CGTCGAGGAG GAGCACCTCC ACCCGGACGG CGGGGAACTG

39901 CCGTACGCCG CCCTGCGCCA CGCCGTAGAG GAGCTCGACG AGCCTCAACT

39951 TGCCCTCCGC AGCGGCAAAT TCCTCGTACC GCGCATGACG CCCGCCGCCG

40001 CCCCCGAGGA GCTCGTCCCG CCGGTCGGTA CGTCCGGCTG GCGCCTCGGC

40051 ACCTCCGGTA CGGCCACCCT GGAGAATCTG TCGGTGATCG ACGCTCCCGA

40101 GGCGTTCGCG CCGCTGGAGC CCGGGCAGGT GCGGATCTCC GTACGGGCGG

40151 CGGGCATGAA CTTCCGTGAC GTGCTGATCG CGTTGGGCAT GTATCCCGAC

40201 AAGGGCACGT TOGCGGGAAG CGAGGGCGCC GGACATGTGA CGGAGGTGGG

40251 ACCGGGCGTC ACTCATCTGT CGGTCGGTGA CCGGGTGATG GGTCTGTTCG

40301 AGGGCGCGTT CGCTCCGCTG GCCGTCGCGG ACGCCCGGAT GGTCGTCCCG

40351 ATTCCGGAGG GCTGGAGCTT CCAGGAGGCC GCGGCGGTGC CCGTGGTGTT

40401 CCTCACGGCC TGGTACGGCC TCGTGGACCT CGGCCGCCTC CGGGCGGGCG

40451 AATCGCTGCT CATCCACGCG GGCACCGGCG GAGTGGGCAT GGCCGCCACC

40501 CAGATCGCCC GCCACCTGGG CGCCGAGGTG TTCGCCACCG CGAGCCCCGC

40551 CAAGCACGGC GTGCTCGACG GCATGGGCAT CGACGCGGCC CACCGCGCCT

40601 CCTCCCGTGA CCTCGACTTC GAGGAGACCT TGCGGGCGGC GACGGGCGGG

40651 CGCGGCATGG ACGTCGTACT CAACAGTCTG GCCGGGGAGT TCACCGACGC

40701 CTCGCTGCGG CTGCTCGCCG AGGGCGGGCG CATGGTGGAC ATGGGCAAGA

40751 CCGACAAGCG CGACCCCGAC CGGGTCGCGG CCGAGCACGC GGGCGCGTGG

40801 TACCGGGCCT TCGACCTCGT GCCGCACGCG GGGCCCGACC GGATCGGGGA

40851 AATGCTGGCG GAGCTGGGCG AGTTGTTCGC CTGCGGCGCC CTGGCGCCGC

40901 TGCCCGTGCA GACCTGGCCG CTGGGCCGGG CGCGTGAGGC GTTCCGGTTC

40951 ATGAGCCAGG CGAAGCACAC CGGCAAGCTG GTGCTGGAGA TCCCGCCCGC

41001 V_ V~ J. V^^". i V^VJ GACGGCACGG TGCTCATCAC CGGCGGCACC GGGGTCCTCG

41051 CCGCCGCGGT GGCCGAGCAT CTGGTGAGGG AGTGGGGCGT ACGACACCTG

41101 CTGCTGGCCG GGAGGCGCGG TTCCGAGGCG CCCGGGAGCA GTGAACTCGC

41151 CGAGGAACTG ACCGAGTTGG GGGCCGAGGT GACCTTTGCC GCGGCCGATG

41201 TCAGTGATCC GGACGCCGTG GCGGAGCTCG TCGGCAAGAC LuA 1 LL-u^^Ij

41251 CACCCGCTGA CCGGTGTGAT CCACGCGGCC GGTGTGCTGG ACGACGCCGT

41301 GGTCACCGCA CAGACCCCGG AGAGCCTCGC GCGGGTGTGG GCGGCGAAGG

41351 CGACGGCCGC ACACCTGCTG CAOGAGGCGA CCCGGGAGGC GCGCCTCGGT

41401 CTCTTCCTGG TGTTCTCCTC GGCGGCGGCG ACACTCGGCA GTCCGGGACA

41451 GGCCAACTAC GCGGCGGCCA ACGCCTATTG CGACGCCCTC GTCCGGCAAC

41501 GGCGTGCCGA GGGCCTGGCC GGTCTCTCGA TCGGCTGGGG TCTGTGGCAG

41551 ACGGCGAGCG GCATGACCGG ACACCTCGGC GAGACGGACC TGGCACGCAT

41601 GAAGCGCACC GGGTTCACCC CGCTGACCAC CGAAGGTGGC TTGGCCCTCC

41651 TCGACGCCGC CCGCGCCCAC GGCCGCCCGC ACGTGGTCGC GGTGGACCTC

41701 GACGCGCGCG CCGTCGCCGC GCAGCCCGCC CCGTCCCGGC CCGCGCTCCT

41751 GCGCGCCCTG GCCGCGGGTG CGACCCCGGG GGCCCGCACC GCGCGGCGCA

41801 CCGCGGCCGC GGGCAGCGTC GCCCCGGCGG GCGGTCTCGC CGACCGGCTC

41851 GCCGGCCTGC CGCATCCCGA ACGGCGCCGG CTGCTGCTCG ACCTCGTACG

41901 TGGCAACGTC GCCGGCGTCC TCGGGCACAG CGACCACGAC GCCGTCCGCC

41951 CGGACACGTC GTTCAAGGAG CTCGGCTTCG ACTCCCTGAC CGCCGTGGAA

42001 CTGCGCAACC GGCTGGCCGC CGCCACCGGC CTGAAGCTGC CCGCGGCGCT

42051 CGTCTTCGAC TACCCCGAGT CGGCCACCCT CGTCGACCAC CTCCTGGAGC

42101 GTCTGTCGCC CGACGGCGCG CCGCCGCCCG TCAAGGACGC CGCGGACCCC

42151 GTTCTCAACG ACCTCGGCAG GATCGAGTCC TCCCTGGACG CGCTCGCCCT

42201 CGACGCGGAC GCGCGCAGCC GGGTCACCAG GCGTCTGAAC ACCCTGCTGT

42251 CGAAGCTGAA CGGAGCCGCC ACCGCCGGCT CCCCGGCGGA CGTCACGGAC

42301 CTGGACGCGC TGGACGCGCT GGACGACGTG TCCGACGACG AGATGTTCGA

42351 GTTCATCGAC CGAGAGCTGT GACCCCCCTG CCCGCCCCGT CCCCCTTCCC

42401 CGCCCCCACG TTCCCCGTGC CCTTCGCTGA TGGAGAAGTG ACGTTCGATG

42451 TCGAGTGCTG AAGAGTCGAG TCCTGATGTG TCCGGCACGG GTGTGTCCGG

42501 TACGGGAGAG TCCGCTACGG GTACGTCGAG TACGGAAGCC AAGCTTCGGC

42551 AGTATCTGAA GCGGGTCACG GTGGACCTCG GCCAGGCCCG CCGGCGGCTG

42601 CGCGAGGTGG AGGAGCGGGC CCAGGAGCCG ATCGCCATCG TCTCCATGGC

42651 GTGCCGCTTC CCCGGCGACA CCCGCACGCC CGAGGCCCTG TGGGACCTGG

42701 TCGCCGAGGG CGGCGACGCC ATCGACGACT TCCCCACCAA TCGCGGCTGG

42751 GACCTGGAGA GCCTCTACCA CCCCQACCCC GACCACCCCG GCACCAGCTA

42801 CGTCCGACGC GGCGGGTTCC TGTACGACGC CCCCGCCTTC GACGCGTCGT

42851 TCTTCGGGAT CAGCCCGCGC GAAGCCCTGG CCATGGACCC GCAGCAGCGG

42901 GTGCTCATGG AGACGGCCTG GCAGCTCCTG GAGCGGGCCG GCATCGACCC

42951 GGCCTCGCTG AAGCTGAGCG CCACCGGCGT CTACATCGGC GCGGGCGTGC

43001 TCGGGTTCGG CGGCGCGCAG CCCGACAAGA CGGTAGAGGG CCACCTCCTG

43051 ACCGGCAGCG CGCTGAGTGT CCTGTCCGGC CGCATCTCCT TCACGCTCGG

43101 CCTCGAGGGC CCGTCGGTCA GTGTCGACAC GGCGTGCTCC TCCTCGCTGG

43151 TCTCCATGCA CCTGGCGGCC CAGGCGCTGC GGCAGGGGGA GTGCGATCTC

43201 GCGCTGGCCG GCGGTGTCAC CGTGATGTCG ACGCCCGGCG CGTTCACCGA

43251 GTTCTCCCGC CAGGGCGCGC TGTCTCCGGA CGGCCGCTCG AAGGCTTTCG

43301 CGGCCTCGGC CGACGGCACC GGTTTCTCGG AGGGCGCGGG ACTGCTCCTC

43351 CTGGAGCGGC TCTCCGACGC GCGCCGCAAC GGCCACAAGG TGCTCGCGGT

43401 GATCCGCGGC TCGGCCGTCA ACCAGGACGG CGCGAGCAAC GGTCTCACCG

43451 CCCCCAACGG CCCCTCCCAG GAACGCGTGA TCCGCGCCGC CCTCGCCAAC

43501 GCGGGCCTGG GCGCCGCCGA GGTCGACGCG GTCGAGGCAC ACGGCACCGG

43551 CACGAAGCTC GGCGACCCCA TCGAGGCCGG TGCGCTGCTC GCCACCTACG

43601 GCCGCGACAG GGACGAGGAC CGGCCGCTGT GGCTGGGCTC GGTCAAGTCG

43651 AACATCGGTC ACCCGCAGGG CGCAGCAGGC GTCGCGGGCG TCATCAAGAT

43701 GGTGATGGCG CTGCAGCGCG AACTGCTCCC CGCCACCCTG TACGTCGACG

43751 AGCCCACCCC GCACGTCGAC TGGTCCTCGG GCTCCGTCAG GCTCCTCACC

43801 GAACCGGTCC CGTGGACCCG CGGCGAGCGC CCGCGCCGCG CGGGCGTGTC

43851 CGCCTTCGGC ATGTCCGGGA CGAACGCCCA CGTGATCCTG GAGGAGGCAC

43901 CGCCCGAGGA GGCAGCGGCC GCGGAGACAC CGGCGGAAGG GACAGGCGCA

43951 GTCGTCCCGT GGGTCGTCTC CGGCCGGGGC GAGGAAGCGC TGCGGGCCCA

44001 GGCCGCACAG CTCGCCGAGC ACGTGCGCGA CGACGACCAG CGGCCGGCGT

44051 CACCGCTGGA GGTGGGGTGG TCGCTCGCCA CGACACGGTC GGTGTTCGAG

44101 AACCGGGCCG TCGTCGTCGG GGACGACCGC GACGCGCTCC TCGACGGCCT

44151 CCGGTCGCTG GCGGCAGGTG AGGCGTCGCC GGACGTGGTG TCCGGGGCGG

44201 TCGGCCCCAC GGGGCCCGGG CCGGTCATGG TGTTCCCCGG CCAGGGCGGC

44251 CAGTGGGTGG GCATGGGGGC CCGGCTCCTC GACGAGTCCC CGGTGTTCGC

44301 GGCCCGGATC GCCGAGTGCG AGCAGGCCCT GTCGGCGTAC GTGGACTGGT

44351 CCCTGACCGA CGTGCTGCGC GGGGACGGGT GGGAGCTGGC CCGGATCGAC

44401 GTCGTCCAGC CCGTGCTGTG GGCCGTCATG GTCGCGCTCG CCGCCGTCTG

44451 GGCGGACCAG GGAATCGAAC CCGCCGCCGT CGTCGGCCAC TCGCAGGGCG

44501 AGATAGCCGC GGCGTGCGTC GTGGGCGCCA TCTCCCTGGA CGAGGCGGCC

44551 CGCATCGTCG CCGTACGCAG TGTGCTGCTG CGGCAGCTGT CCGGACGCGG

44601 CGGCATGGCG TCCCTGGGGA TGGGCCAGGA GCAGGCCGCC GACCTGATCG

44651 ACGGACACCC GGGTGTGGTC GTCGCGGCCG TCAACGGGCC GTCGTCCACC

44701 GTCATCTCGG GCCCGCCCGA GGGCATCGCC GCCGTCGTCG CCGACGCCCA

44751 GGAGCGGGGC CTTCGCGCCA GGGCCGTCGC CTCCGACGTC GCGGGCCACG

44801 GCCCGCAGCT GGACGCGATC CTGGACCAGC TCACGGAGGG CCTGGCCGGC

44851 ATCCGGCCCG CCGCGACCGA CGTCGCGTTC TACTCCACCG TCACCGCCGG

44901 GCACCTCACC GACACCAGCG AACTCGACAC CGCGTACTGG GTGCGGAACG

44951 TGCGCCGGAG GGTGCGTTTC GCCGACACGA TCGACGCGCT GCTCGCGGAC

45001 GGGTACCGCC TGTTCATCGA GGTGAGCCCC CACCCCGTCC TCAACCTCGC

45051 GCTGGAAGGC CTCATGGAAC GGGCGGCCGT GCCCGCCACG GTCGTGCCCA

45101 CCCTGCGCCG CGACCACGGC GACACCAGCC AGCTCGCCCG CGCCGCGGCC

45151 CACGCCTTCG CCGCCGGCGC GGACGTCGAC TGGGGGCGCT GGTTCCCGGC

45201 CGACCCCGCC CCCCGTACCG TCGACCTGCC CAGCTACGCC TTCCAGCGCC

45251 AGGACTTCTG GCCGGCCCCC GCCGGCGGGC GGTCCGGCGA CCCTGCCGGG

45301 CTCGGCCTCG CCGCCTCCGG ACACCCGCTC CTGGGCGCCT CCGTGGGCCT

45351 CGCGAGCGGC GACGTACACC TGCTGAGCGG GCGGGTGTCC CGGCAGTCCG

45401 CCGCGTGGCT GGACGACCAC GTCGTGGCGG GCCAGGCCCT GGTGCCCGGC

45451 GCGGCGCAGG TGGAGTGGGT GCTGCGGGCC GGCGACGACG CGGGCTGCTC

45501 CGCCCTGGAG GAGCTGACGC TCCAGACGCC GCTCGTGCTG CCCGACACCG

45551 GCGGCCTGCG GATCCAGGTC GTCGTCGAAG CGGCCGACGC ACACGGCCGG

45601 CGCGACGTCC GGCTGTTCTC CCGCCCCGAT GACGACGACG CCTTCGCGTC

45651 GACGCACCCC TGGACCTGCC ACGCCACGGG CGTGCTCGCC CCCGCCCCGA

45701 CGGACGGCAC CAACGGAACG CGGGACGCCG CCGACACCCT GGACGGCGCA

45751 TGGCCCCCGG CCGACGCCGA ACCCGTCCCC GCCGACGACC TCTACGCGCA

45801 GGCCGACCGC ACCGGATACG GCTACGGCCC CGCCTTCCGG GGCGTACGGG

45851 CGCTGTGGCG CCACGGCAAG GACGTCCTGG CCGAGGTGAC GCTGCCCAAG

45901 GAGGCCGGCG ACCCGGACGG CTTCGGTATC CACCCGGCCC TCCTCGACGC

45951 CGTCCTGCAA CCCGCCGCAC TGCTGCTGCC CCCGACCGAC GCCGAACAGG

46001 TCTGGCTGCC GTTCGCCTGG AACGACGTGG CGCTGCACGC CGTACGGGCC

46051 ACCACGGTCC GGGTGCGCCT CACCCCGCTC GGCGAGCGGA TCGACCAGGG

46101 GCTGCGCATC ACCGTGGCCG ACGCCGTGGG CGCGCCCGTG CTCACCGTCC

46151 GCGACCTGCG CTCGCGCCCG ACCGACACAG GCCGCCTCGC CGCGGCCGCG

46201 ACCCGCGACC GGCACGGGCT GTTCGACCTG GAGTGGATCG CGCCGGAGAA

46251 CGCGGCGGAG AACGCGGCGG GTCCGGCCCG GGACGCGTCC GAAGGGTGGG

46301 TGACACTCGG CGAGGACGCC GCGAGCCTCG CGGACCTGCT GGCGTCCGTC

46351 GAGGCGGGCG CTCCGGCGCC GCAGCTCGTG GCCGCCCCCG TCGAACCCGA

46401 CCGGACCGAC GACGGCCTGG CACTCGCCAC CCACGTCCTC GACCTCGTAC

46451 AGACCTGGCT CGCCTCGCCC CTGCACGACT CCCGCCTGGT CCTGGTGACG

46501 CGAGGGGCAG TGACGGATGC GGATGTGGAT GTGGCTGCCG CGGCCGTTTG

46551 GGGTCTGGTA CGCAGCGCCC AGTCGGAGCA CCCCGGCCGC TTCACGCTGA

46601 TCGACCTCGG CCCCGACGAC ACGCTTGCCG CAGCCATGCA GGCGGCGCAC

46651 CTGGAAGAGC CGCAACTGGC GGTGCACGGC GGCGAGATAC GAGTGCCGCG

46701 ACTGGTCCGC GCCACGACCG ACCCGACCGC CCCGAACGGG ACACCGGAGG

46751 CCGACCGGAC GGCGGACCCG TCCGAAGGAC TCCACCGGAA CGGTACGGTT

46801 CTCATCACCG GCGGCACCGG CGTACTCGGC CGACTGGTGG CCGAACACCT

46851 GGTCACGGAG TGGGGCGTAC GCCACCTGCT GCTCGCGAGC CGACGCGGCG

46901 ACCAGGCGCC GGGTAGCGCC GAACTCCGCG CCCGCCTGAG CGAATTGGGA

46951 GCATCGGTCG AGATCGCCCC GGCCGATGTC GGCGACGCGG AAGCGGTCGC

47001 CGCACTGATC GCGTCGGTCG ACCCGGCGCA CCCGCTCACC GGTGTGATCC

47051 ACGCGGCCGG TGTCCTGGAC GACGCCGTGA TCACCGCCCA GACCCCCGAG

47101 AGCCTCGCGC GGGTGTGGGC GACGAAGGCG ACGGCGGCCC GCCATCTGCA

47151 CGAGGCGACA CGGGAGACAC CCCTCGACTT CTTCGTGGTG TTCTCCTCGG

47201 CGGCCGCCTC GCTCGGCAGC CCCGGCCAGG CCAACTACGC GGCGGCCAAC

47251 GCCTATTGCG ACGCCCTCGT CCAGCACCGC CGCGCCCAAG GGCTCGCGGG

47301 CCTCTCGATC GCCTGGGGCC TGTGGCAGGC GACCAGCGGC ATGACCGGGC

47351 AGCTGAGCGA GACCGACCTG GCGCGCATGA AGCGCACCGG GTTCGCCGCG

47401 CTGACCGACG AGGGCGGCCT GGCCCTGCTC GACGCCGCCC GTGCCCACGA

47451 CCGGGCCTAC GTGGTCGCGG CCGACCTCGA CCCGCGCGCC GTGACCGATG

47501 GCCTGTCCCC GCTCCTGCGC GCCCTCACGG CGCCCGCCAG GCGGCGGCGC

47551 GTGGCCTCCG AAGGCCTCGC CGACGGGGCG CTCGCGACCC GCCTGGCCGG

47601 CCTCGACGCG GACGGCCGCC TAAGGCTCCT CACCGATGTC GTACGCGAGT

47651 ACGTCGCGGC CGTCCTCGGC CATGGTTCCG CCGCCCGGGT GGGCGTCGAC

47701 ATCGCCTTCA AGGACCTGGG TTTCGACTCG CTGACCGCGG TGGAGCTGCG

47751 CAACCGGCTG TCGGCCGCCT GTGACGTGCG GCTGCCCGCC ACACTGATCT

47801 TCGACCACCC CACCCCGCAG GCTCTCGCCA CCCACCTGGT GGACCGCTTG

47851 GCGGGCAGCA CCTCCGCGAC CACGACGGTG AATGCGACGG CGCCGGCAGC

47901 CGCCCACGTC GCCGCAGGGG CCGACGTCGA CGCAGACACC GACGACCCGG

47951 TCGCCATCGT CGCCATGACG TGCCGGTTCC CGGGCGGCGT CGCGTCCCCG

4 8001 GACGACCTGT GGGACCTGCT CGACGCACGC AAGGACGCGA TGGGCGCCTT 
48051 CCCCACCGAC CGCGGCTGGG ACCTGGAACG CCTCTTCCAC CCCGACCCGG

4 8101 ACCACCCCGG CACCAGCTAC ACCGACCAGG GCGGATTTCT TCCCGACGCG 

4 8151 GGTGATTTCG ATGCGGCGTT CTTCGGGATC AATCCGCGGG AGGCGCTGGC 

4 8201 GATGGATCCG CAGCAGCGGT TGTTGCTGGA GGCGTCGTGG GAGGTGTTGG 

4 8251 AGCGTGCGGG TATCGATCCG ACGACGCTCA AGGGCACCCC GACCGGCACC 

4 8 301 TACGTGGGCC TCATGTACCA CGACTACGCC AAGTCCTTCC CCACGGCCGA 

4 8351 CGCCCAGTTG GAGGGCTACT CCTACTTGGC GAGCACCGGC AG CATGGTCT 

4 8401 CCGGCCGCGT CGCCTACACC CTGGGCCTTG AAGGTCGGGC GGTGACGGTC 

4 8451 GACACCGCGT GCTCCTCCTC CCTGGTCTCC ATCCACCTGG CGACGCAGGC 

4 8501 ACTCCGGCAC GGCGAGTGCG ACCTCGCCCT GGCAGGCGGT GTGACCGTCA 

4 8 551 TGGCCGACCC GGACATGTTC GCGGGCTTCT CGCGCCAGCG CGGCCTCTCA 

4 8601 CCTGACGGCC GCTGCAAGGC CTACGCCGCC GCGGCCGACG GAGTCGGATT 

4 8651 CTCCGAGGGA GTGGGCGTAT TGCTCCTTGA GCGGTTGTCG GATGCGCGGC 

4 8701 GTCATGGGCG TCGGGTGTTG GGTGTGGTGC GGGGTTCGGC GGTGAATCAG 

4 8751 GACGGTGCGA GTAATGGGTT GACGGCGCCG AATGGTCCGT CGCAGGAGCG 

4 8801 GGTGATTCGT CAGGCGTTGG CCAGTGGTGG GTTGTCGTCG GTGGATGTTG 

4 8851 ATGTGGTGGA GGGGCATGGG ACGGGGACCA CGTTGGGTGA TCCGATCGAG 

4 8 901 GCGCAGGCTC TGCTGGCCAC ATATGGG C AG GGGCGTCCGG AGGACCGTCC 

4 8 951 GTTGTGG TTG GGGTCGGTGA AG TCGAAC AT TGGTCATACG CAGGCGGCTG 

4 9001 CGGGTGTTGC GGGTGTCATC AAGATGGTGA TGGCGATGCG GCATGGTGTG 

4 9051 GTGCCGGCGA GTTTGCATGT GGATGTGCCG TCGCCGCATG TGG AG TGGG A 

4 9101 TTCGGGTGCG GTGCGGTTGG CGGTTGAGTC GGTGCCATGG CCGCAGGTGG 

4 9151 AGGGTGGTCC GCGTCGGGCG GGTGTGTCGT CGTTCGGCGC TTCGGGGACG 

4 9201 AATGCGCACG TGATCGTGGA GTCTGTTCCC GATGGGCTGG AGGAGGACTC 

4 9251 GGTATCGGTC GGCGGTGAGG CTCTTGAGAC GGAGACTGAC GGGCGCTTGG 

4 93 01 TGCCGTGGGT OGTGTCGGCC CGCAGCCCGC AGGCCCTGCG CGACCAGGCA 
49351 CTACGCCTGC GTGACTTTGC CAGTGACGCG TCGTTCCGCG CGCCGCTCGC

4 9401 CGACGTGGGC TGGTCGCTGC TGAAGACGCG TGCGCTGCAT GAGCATCGCG 

4 9451 CCGTTGTGGT GGGCGCGGAG CGGGCAGAGC TGATCGCCGC TCTGGAGGCG 

4 9501 GTGGCGACGG GTGAGCCGCA TGCGGCGCTG GTCGGCCCGG CTTGCTCGCA 

4 9551 GGCTCGGGTG GGTGGCGATG ACGTGGTGTG GCTGTTCAGT GGTCAGGGCA 

4 9601 GTCAGTTGGT CGGTATGGGT GCTGGTTTGT ATGAGCGGTT CCCGGTGTTT 

4 9651 GCGGCTGCGT TTGATGAGGT GTGCGGCCTG TTGGAGGGGC CGTTGGGCGT 

4 9 701 GGAGGCGGGT GGGTTGCGGG AGGTGGTGTT CCGTGGCCCG CGGGAGCGGT 

4 9751 TGGATCACAC GGTGTGGGCG CAGGCGGGGT TGTTTGCGCT GCAGGTGGGG 

4 9801 TTGGCCCGGT TGTGGGAGTC GGTCGGGGTG CGGCCGGATG TGGTGCTCGG 

4 9851 GCATTCGATC GGTGAGATCG CGGCCGCGCA TGTGGCGGGG GTTTTTGATC 

4 9901 TGGCGGATGC GTGTCGGGTG GTGGGTGCGC GGGCGCGTTT GATGGGTGGG 

4 9951 CTGCC TGAGG GTGGGGCGAT GTGCGCGGTG CAGGCCACGC CCGCCGAGCT 
500 01 GGCCGCCGAC GTGGACGGAT CGGCTGTAAG TGTGGCGGCA GTCAACACCC 

5 0051 CCGACTCCAC GGTGATTTCG GGCCCGTCGG ACGAGGTGGA CCGGATTGCT 
50101 GGGGTGTGGC GGGAGCGTGG GCGCAAGACG AAGGCGCTGA GCGTCAGTCA 
50151 TGCCTTCCAT TCGGCGTTGA TGGAGCCGAT GCTCGCGGAG TTCACCGAAG
50201 CGATACGAGG GGTCAAGTTC AGGCAGCCGT CGATCCCGCT CATGAGCAAT 

502 51 GTCTCCGGAG AGCGGGCCGG CGAGGAGATC ACGGATCCGG AGTACTGGGC 

503 01 GAGGCATGTA CGTAATGCGG TGCTCTTCCA GCCCGCCATC GCCCAAGTAG 
5 0351 CGGATTCAGC GGGCGTGTTT GTGGAGCTCG GCCCCGCGCC TGTGCTGACC 
5 0401 ACGGCCGCCC AGCACACCCT GGACGAGTCG GACAGCCAGG AGTCGGTGCT 
5 0451 GGTCGCGTCT CTCG CCGGTG AGCGTCCTGA GGAGTCGGCG TTTGTGGAGG 
505 01 CGATGGCTCG TCTGCATACC GCTGGTGTTG CTGTGGACTG G TCGG TG T TG 
5 0551 TTCGCGGGTG ATCGTGTGCC TGGGCTGG TG GAGTTGCCGA CGTATGCGTT 
5 0601 CCAGCGGGAG CGGTTCTGGT TGAGTGGCCG TTCTGGGGGT GGGGATGCGG 
50651 CGACTTTGGG GTTGGTGGCG GCGGGGCATC CGTTGTTGGG TGCGGCGGTG

50701 GAGTTCGCGG ACCGGGGTGG GTGTCTGCTG ACCGGTCGTC TGTCGCGGTC 

507 51 TGGGGTGTCG TGGCTTGCTG AT C ATGTGGT GGCGGGTGCG GTTTTGGTGC 

50801 CGGGTGCTGC GTTGGTGGAG TGGGCGTTGC GGGCCGGTGA TGAGGTCGGT 

50851 TGTGTGACGG TGGAGGAGTT GATGTTGCAG GCGCCTTTGG TGGTGCCTGA 

50901 GGCGTCGGGT CTGCGGGTTC AGGTGG TGG T TGAGGAGGCG GGTGAGGAtzG 

50951 GGCGGCGCGG TGTTCAGATC TACAGCCGGC CCGACGCGGA CGCCGTGGGC 

51001 GGCGATGACT CGTGGATCTG CCACGCGACC GGCGTACTGT CACCCGAAAG 

51051 CGCTCGTCTG GACACGGAGT TGGGTGGCGT CTGGCCACCG GCCGGTGCCG 

51101 AACCGCTGGA TGTCGACGGC TTCTACGCGC AGGCCGGTGA GGCCGGGTAC 

51151 GGATACGGTC CGGCGTTCCG GGGGCTGCGT GCCGTGTGGC GGCACGGCCA 

51201 GGACCTGCTG GCCGAGGTCG TCCTGCGCGA AGCCGCCGGT GCCCATGACG 

51251 G CTACGGG AT CCACCCCGCC CTCCTCGACG CCACCCTCCA TCCGCTGCTC 

51301 GCCGCCCGCT TCATGGACGG TTCCGAGGAC GATCAGCTCT ACGTACCGTT 

513 51 CGGGTGGGCC GGAGTGTCTC TGCGGGCGGT GGG AG CCACG ACTGTG CGCG 

514 01 TGCGCCTCCG TCCGGTCGGG GAGAGCGTCG ACCAAGGGCT GAGCGTGACG 
514 51 GTCACCGATG CGACCGGCGG TCCCGTTCTG AGCGTCGACT CCCTCCAGAC 
51501 CCGCCCCGTG AAGCCGAGCC AATTGGCTGC GGCCCAACAG CCGGACGTAC 
51551 GCGGTCTGTT CAC TGTGG AG TGGACGCCGC TGCCGCAGAC GGATGCCGAC 
51601 GGGGAGGCCG ACTGGGTTGT GCTCTCGGAC GGTGTTGGCC GTCTGG CTGA 
51651 TGTGGTGTCG GCGGCGGGTG GTGAAGCGCC GTGGGCAGTG GTCGCTCCTG 
51701 TCGATGCGTC TGTGG GCG AC GGCCGTGAGG GTCTTGACGG TCGGCTGGTC 
51751 GTGGAGCGGG TGCTG TCACT CGTACAGGAG TTCCTGGCCC TGCCGGAGCT 
51801 GGCCGAGTCC CGTCTCCTCG TGGTGACGCG CGGTGCGGTG GCCACCGGCG 
51851 TCGACGGTGA CGGTG ACGTG G ACGCG TCCG CCGCAGCTGT ATGGGGCCTG 
51901 GTCCGCAGTG CTCAGTCCGA GAATCCGGGC CGCTTCATCC TGCTCGACGT 

51951 GGACGGCGAC GGCGACGACC AGGGCCCGGA CCTGAACGGC CGGCATCTGC 

52 001 CCCACGCCAC CCTGCGTCAC GCCGCCGhGG AACTCGACGA GCCCCAACTC 

52 051 GCCCTGCGGG AAGGGACGGT CTACGTCCCC CGACTGACCC AGGCGCGCCA 

52101 GTCCGCCGAA CTCGTCGTGC CGCCCGGTGA ACCGGCGTGG CGCCTGCGGA 

52151 TGGTGCACGA CGGCTCGCTG GACGCCCTGG CGGCAGTGGC CTGCCCGGAG 

52201 GCCCTGGAGC CCTTGGCGCC GGGGCAGGTG GGTAT CGCCG TACACGCCGC 

522 51 GGGCATCAAC TTCCGTGACG TACTGGTGGC CTTGGGTATG GTCCCCGCGT 
52301 AGGGGGC CAT GGGTGGCGAA GGTG CCGGTG TCGTGACGGA GGTCGGTCCC 

523 51 GAGGTCACCC ATGTCTCGGT GGGCGACCGC GTGATGGGCG TGTTCGAGGG 

524 01 CGCGTTCGGC CCTGTGGTGA TCGCCGAGGC GCGGATGGTC ACACCTGTCC 
52451 CGCAGGGCTG GGACATGCGG GAGGCGGCCG GTATTCCGGC GGCCTTCCTG 
52501 ACGGCTTGGT ACGGGTTGGT GGAGCTGGCC GGTCTGAAGG CGGGCGAGCG 
52551 GGTGCTGGTC CATGCCGCGA CGGGTGGTGT GGGGATGGCG GCGGTGCAGA 
5 2601 TCGCCCGGCA TGTGGGTGCC GAGGTGTTCG CCACCGCGAG TCCGGGCAAG 
52651 CACGCCGTGC TGGAGGAGAT GGGCATCGAC GCCGCCCACC GCGCCTCCTC 
52701 CCGGGACCTC GCCTTCGAGG GCACGTTCAG GGAAGCAACG GGCGGCCGCG 
527 51 GCATGGACGT CGTGC TCAAC AGCCTTGCCG GCGAGTTCAT CGACGCCTCT 
52 801 CTGCGGTTGC TCGGCGACGG CGGCCGGTTC CTGGAGATGG GCAAGACCGA 
52851 TGTGCGGGCC GCCGAAGAGG TGGCTGCGGA GCACGCGGAC GTCTCGTACA 

52 901 CGGCGTACGA CCTCGTCGGT GATGCCGGAC CCGACCGCAT CAGCAACATG 
52951 CTGGACAAGC TCGTCGAATT GTTCGCCTCA GAACGGCTTA AGCCGCTGCC 

53 001 GGTACGTTCC TGGCCGCTGG ACAAGGCGCA GGAGGCGTTC CGGTTCATGA 
53 051 GTCAGGCGAA GCACACCGGC AAGCTGGTGC TTGAGATCCC GCCTGCCCTC 
53101 GACCCCGAGG GCACGGTTCT GGTCACCGGG GGCACCGGTG CGCTGGGGCA 
53151 GGTCGTGGCC GAGCATCTGG TCCGGGAGTG GGGCGTACGG CACCTGCTGC 
532 01 TGGCCAGCCG TCGCGG TCCG GAGGCGCCGG GCAGCGACGA ACTGGCCTCG 
53251 AAGCTCACCG GGTTGGGTGC CGAGGTCACC ATTGTCGCGG CCGATGTCAG

5 3 301 CGACCCGGCC TCGGTGGTGG AGCTGGTCGG CAAGACGGAT CCCTCGCATC 

5 3 351 CGTTGACGGG TGTCGTGCAC GCGGCGGGCG TGTTGGAGGA CGGTGTCGTG 

534 01 ACCGCTCAGA CGCCTGAGGG GCTGGCGCGG GTGTGGGCGG CCAAGGCTGC 

53451 TGCGGCGGCG AATCTCCATG AGGCGACCCG GGAGATGCGT CTCGGCCTGT 

53501 TCGTGGTGTT CTCCTCGGCG GCCGCCACGC TCGGCAGTCC GGGCCAGGCC

5 3 551 AACTACGCGG CCGCCAATGC CTATTGCGAC GCGCTGATGC AGCACCGACG 

53 601 GGCGGTGGGC CAGGTCGGCC TGTCGGTCGG CTGGGGTCTC TGGGAGGCGC 

53 651 CGGACGCCAA GCCGGGTGTT GCCGCCGACG CCAAGGCGAG TGCTGCCACC 

53 701 GTCGGCAAGG CGAGTGCTCT ATCCGACGGC ACGAACGGCA GCGCTCCCCA 

5 3751 GGACACGACC GGCACCGCCC CCCAGGGCAT GACCGGCGGA CTCACCGACA 

53 801 CCGACGTAGC CCGCATGGC A CGTATCGGCG TCAAGGGCAT GAG CAACG C C 

5 3 851 CACGGTCTCG CCCTGTTCGA CGCCGCGCAC CGCCACGGCC GCCCCCACCT 

53 901 GGTCGG CTTG AACCTCGACC TGCGCACCCT GGCCAGGCAC CCCCTGCACA 
5 3 951 CCCGGCCCGC CCTTCTGCGC GGCCTGGCCA CCCCCACCGC CGGCGGGGCG 

54 001 AGCAGGCCGA CCGCGACGGC GGGCGGACAG CCCGCCGACC TGGCGGGCCG 
54 051 GCTGGCCGCG CTGTCGCCGT CGGACCGGCA CCACACGCTG GTCCGGCTCA 
54101 TC AGGGAA CA GGCCGCCACC GTG CTCGGGC ACCACCCGGA CAGTCTCACC 
54151 ACGGGCAGCA CCTTCAAGGA ACTCGGATTC GACTCCCTGA CCGCGGTCGA 
54201 ACTGCGCAAC AGGCTGTCCG CCGCCACCGG TCTCCGGCTC CCCGCCGGCC 

542 51 TGGTCTTCGA CCACCCGGAC GCCGACATCC TGGCCGAACA CCTCGGCGCG 

543 01 CAACTCGCCC CCGACGGGGA CACCCCCGCC GGTG CGGAAG CCACCGACCC 
54351 GGTCCTCCGC GACCTGGCGA AACTCGAGAA CGCCCTCTCC TCCACCCTCG 

544 01 TCGAGCACCT CGACGCCGAC GCGGTCACGG CCCGACTGGA AGCACTCCTG 
544 51 TCGAACTGGA AGGCGGGGAG CGCGGCGCCC GG CTCGGGC A GCACGAAGGA 
54501 GCAGCTCCAG GTTGCCACGA CCGACCAGGT CCTCGACTTC ATCGACAAAG 
54551 AACTGGGTGT GTGAAACGAC CGTGCACGGC GCGACAACCA CGCTGAAGGC

54601 TGGGTGAACT CTCATGGCGA GTGAAGAGGA ACTGGTCGAC TACCTCAAGC 

54 651 GGGTCGCCGC CGAACTGCAC GACACCCGGC AGCGCCTGCG CGAGGTCGAG 

54 7 01 GACCGGCGGC AGGAGCCGGT GGCCGTCGTC GGCATGGCCT GCCGTTTCCC 

54 751 CGGCGGC AT C GAGACGCCCG AGGGACTGTG GGAGCTGGTC GCGGCCGGCG 

54 801 ACGACGCCAT TGAGCCCTTC CCCACCGACC GGGGCTGGGA CCTGGAACJfcC 

54 851 ATCTACCACC CGGACCCCGA CCACCCGGGT ACCTGCTACG TCCGGGAGGG 

54 901 CGGGTTCCTA GCCGCCCCTG ACCGGTTCGA CTCCGACTTC TTCGG CTTC A 

54 951 GCCCGCGCGA GGCCCTGGCC AGCAGCCCGC AACTG C G AC T GCTCCTGGAG 

55 001 ACGTCCTGGG AGGCCCTCGA ACGGGCGGGC ATCAACCCCG CCTCGCTCAA 
55051 GGGCAGCCCC ACCGGCGTCT ACGTCGGCGC CGCGACCACC GGCAACCAGA 
55101 CGCAGGGCGA CCCCGGCGGC AAGGCGACCG AGGGTTACGC GGGCACCGCG 
55151 CCCAGCGTCC TCTCGGGCCG CCTCTCGTTC ACGCTCGGCC TGGAGGGCCC 
552 01 GGCGGTGACC GTCGAGACAG CGTGCTCCTC CTCG CTGGTG GCGATGCACC 

552 51 TGGCGGCCAA CGCCCTGCGC CAGGGCGAGT GCGACCTCGC CCTCGCGGGC 
55 3 01 GGCGTC AC CG TCATGTCCAC CCCCGAGGTG TTCACAGGCT TCTCGCGTCA 

553 51 GCGGGGACTG GCCCCCGACG GCCGCTGCAA GCCGTTCGCC GCCGCGGCCG 
55401 ACGGCACGGG CTGGGGCGAG GGCGCGGGCC TGATCCTCCT GGAGCGCCTC 

554 51 TCCGACGCCC GCAGGAAGGG CCACAAGGTC CTCGCGGTGA TCCGGGGCTC 
55501 GG CG AT CAAC CAGGACGGOG CG AG CAACGG CTTCACCGCG CCCAACGGCC 
55 551 CCTCGCAGCG CCGCGTCATC CGCCAGGCAC TCTCCAGCGC CCACCTCTCC 
55601 ACGTCGGAGA TCGACGTCGT CGAGGCGCAC GGCACCGGCA CCAGGCTCGG 
55651 CGACCCCATC GAGGCCGAGG CGCTCATCGC CACCTACGGC AAGGAGCGCG 
557 01 AGGACGACCG TCCCCTGTGG CTCGGCTCGG TCAAGTCCAA CATCGGCCAC 
55751 ACGCAGGCCG CCGCGGGCGT CGCCGGAGTC ATCAAGATGG TGATGGCGCT 
55801 ACAGCGCGAA CTGCTTCCCG CCACCCTGAA CGTCGACGAG CCGACCCCGC 
55851 ACGTCCAGTG GGAGGGCGGC GGCGTACGCC TCCTGACCGA ACCGGTCCCG

55 901 TGGTCGCGCG GCGAACGCCC GCGCCGCGCC GGAATCTCCT CCTTCGGCAT 

55 951 ATCGGGCACG AACGCGCACG TGGTCCTGGA GGAGGCGCCG CCGGAGGAGG 

56001 ACGTG CCGGG CCCCGTGGCT GCGGAGCCGG AAGGGGTGGT GCCGTGGGTG 

5 6 051 GTCTCCGCGC GGACCGAGGA GGCGTTGAGC GAACAGGCGC GGCGCCTGGG 

56101 CGAGTTCGTG GCCGACACGG ACCCGTCGAC CGCTGACGTC GGGTGGTCAC 

56151 TGACCACGAG CAGGGCGATC CTTGAACACC GCGCTGTGGT GGTGGGGCGT 

56201 GATCGGGATG CGCTGACGGC CGGCCTGGCG GCGTTGGCCG CGGGTGAGGA 

5 6251 GTCGGCGGAT GTGGTGGCTG GGGTGGCCGG TGATGTGGGT CCTGGGCCGG 

563 01 TGTTGGTGTT TCCGGGGCAG GGGTCGCAGT GGGTGGG CAT GGGCGCCCAG 

563 51 CTCCTTGACG AGTCGCCCGT CTTCGCGGCG CGGATCGCGG AGTGTG AG C A 

56401 GGCGCTGTCG GCGTACGTGG ACTGGTCGCT GAGTGCGGTG TTGCGCGGGG 

5 6451 ATGGGAGTGA ACTGTCCCGG GTCGAGGTCG TGCAGCCGGT GTTGTGGGCG 

565 01 GTGATGGTCT CGCTGGCTGC CGTCTGGGCG GATTACGGGG TCACCCCGGC 

5 6551 CGCTGTGATC GGGCACTCGC AGGGCGAGAT GGCCGCCGCG TGCGTGGCGG 

5 6601 GGGCGCTGTC TTTGGAGGAT GCGGCGCGCG TCGTGGCCGT ACGCAGTGAC 

5 6651 GCGCTTCGTC AGCTGATGGG GCAGGGCGAC ATGGCGTCGT TGGGCGCCAG 

56701 CTCGG AG C AG GCGGCTGAGC TCATCGGTGA TCGGCCGGGC GTATGCATCG 

567 51 CAGCGGTCAA CGGGCCGTCC TCGACAGTCA TTTCAGGACC G CCGG AG CAT 
56801 GTGGCAGCCG TGGTCGCGGA TGCGGAGGAA CGTGGTCTGC GCGCCCGTGT 

568 51 CATCGATGTC GGCTATGCCT CGCACGGTCC CCAGATCGAT CAGCTCCACG 
5 6 901 ACCTCCTCAC CGACCGGCTC GCCGACATCC GGCCCGCGAC CACGGACGTG 
56951 GCCTTCTATT CGACGGTCAC CGCCGAGCGC CTGACGGACA CCACGGCCCT 
5 7001 GGATACGGAT TACTGGGTTA CCAACCTCCG CCAGCCGGTC CGTTTCGCCG 
57 051 ACACCATCGA TGCGCTTCTC GCGGACGGCT ATCGCCTGTT CATCGAGGCC 
5 7101 AGCGCGCACC CGGTGCTGGG TCTGGGCATG GAGGAGACCA TCGAGCAGGC 
57151 GGACATCCCC GCCACGGTCG TCCCCACCCT GCGCCGCGAT CACGGTGACA

57 2 01 CCACCCAGCT CACCCGTGCC GCAGCGCACG CCTTCACCGC CGGCGCCACC 

572 51 GTCGACTGGC GGCGCTGGTT CCCGGCCGAC CCCACCCCCC GCACGATCGA 

573 01 CCTGCCCACC TACGCCTTCC AGCGCCGCAG CTACTGGTTG CCGGTGGACG 
57 3 51 GTGTCGGAGA TGTG CGGTCG GCCGCGCTGC GGCGGGTGGA ACACTCGCTG 

574 01 TTGCCCGCGG CGCTCGGTCT CGCCGATGGT GCGCTCGTGC TGACCGGAtG 
574 51 GCTCGCGGCG TCCGGTGGTG GTGGCGGTTG GCTCGCGGAT CACGCGGTGG 
5 7501 CGGGCACGAC GCTCGTCCCC GGTGCCGCGC TGGTCGAGTG GGCGTTGCGG 
57 551 GCCGCCGACG AGGCGGGCTG CCCCTCCCTT GAGGAGCTGA CGCTCCAGGC 
5 7601 ACCTCTGGTG CTGCCCGGCT CCGGGGGCCT CCAGGTCCAA GTGGTCGTGG 
57651 GTCCGGCCGA CGGACAGGGC GGCCGGCGTG AGGTGCGCGT CTTCTCGCGT 
57701 GTCGACTCGG ACGACGAGGC AG CGGGGCAG GACGAGGGGT GGTCGTGTCA 
57751 CGCGACCGGT GTGCTGAGCC CCGAGCCCGG TGCGGTACCG GACGGGCTCA 
57801 GCGGACAGTG GCCCCCGACG GGCGCCGAGC CGCTGGAGAT CAGTGATCTC 
5 7851 TACGAGCAGG CGGCATCGGC GGGATACGAG TACGGG CCGT CGTTCCGGGG 
57 901 CCTGCGCTCC GTGTGGCGGC ACGGGCATAA CCTGCTGGCA GAGGTGGAGC 
5 7 951 TGCCCGAACA GGCAGGTGCG CACGACGACT TCGGCATCCA CCCCGTACTG 
58001 CTGGACGCCG CGCTGCACCC GGCGCTGCTG CTCGACCAGA ACGCGCCCGG 
5 8 051 CGAAGAGCAA GAGCCAGCCC AGCCCGCTCT TCGCCTGCCG TTCGTGTGGA 
5 8101 ACGGGGTCTC CCTGTGGGCC ACCGGCGCCG CG ACCGTGCG GGTACGGCTG 
5 8151 GCCCCGCACG GGGGAGGGGA GACGGACGAT AGCGCCGGGC TGCGCGTGAC 
58201 GGTCGCCGAC G CCACCGG AG CACCGGTGCT GAGCGTGGAC TCCCTCGCTC 
58251 TGCGCCCCGC TGACCCCGAA CTGCTGCGCA CGGCCGGTCG GGCGGGCAGC 
583 01 GGCACCAACG GCTTGTTCAC GGTGGAGTGG ACCGCTCTGC CCCCGGCGGA 
583 51 CGTGGCCGAC CACGCCGCAG GCGACGGCTG GGCGGTGCTC GGTCAGGACG 
58401 TACCCGACTG GGCCGGAGCG GACATGCCCC GGCATCCCGA CATGGCCTCC 
58451 CTGTCGGCCG CGCTGGACGA GGGAACGCAG GCCCCTGCGG CCGTCTTCGT

5 8501 GGAGACCACA GCCACATCGC ACGCCACACC GAACACCGCA GCGGACGTGA 

58551 CGCTCGACGC GTCCGGCCGG GCGGTCGCCG AGCGCACCCT GCACCTGCTG 

5 8601 CGGGACTGGC TCGCCGAACC GCGCCTCGCC GAG AC C CGG C TCGTCCTCAT 

5 8651 CACCCACCAC GCGGTGACGA CCCCGGCGGA CGACGACGTG AACGCCGCAC 

5 8701 CCCTCGACGT CCCGGCCGCC GCCCTGTGGG GACTGATCCG CAGCGCACAG 

5 8751 GCCGAACACC CGGACCGCTT CGTTCTGTTG GACACCGACG CGAAGGCCAA 

5 8801 CACCGACCCC GGCCCCGACA CGAGTACTGA CCACAGCACC GCATCGGGTA 

58 851 CGTACCGAAC CGTCATCGCG CGGGCCCTCG CCACCGGGGA GCCACAG C TG 

58901 G CCGTG CGCG CGGGAGAACT GCTGGCTCCC CGCCTCGCCC GAGCCGCCAC 

5 8 951 CCCCACACCC GAGACCCCCA CACCCGAGAC ACAGCCCGAC ACCGGATCCG 

59001 GGTCCGAGGC CGGGGCCGGG TCCGGATCTG GACCCGGCGC GACACTGGAC 

5 9051 CCCGACGGCA CCGTCCTCAT CGCGGGCGGC ACCGGCATGA TGGGTGGTCT 

5 9101 CGTCGCCGAA CACCTGGTCC GCGCCTGGTC GGTGCGGCAC CTCCTGCTCG 

59151 TCAGCCGGCA AGGGCCCGAC GCGCCGGACG CCCGCGACCT CGCCGACCGG 

59201 CTGGTCGGCC TGGG CGCG AC GGTACGGATC G TCG CGGCCG ACCTGACGGA 

5 9251 CGGGCGGGCC ACCGCGGACC TCGTCGCGTC GGTCGACCCG GCGCACCCGC 

5 9301 TCACCGGTGT GATCCACGCG GCCGGCGTCC TGGACGACGC CGTGGTCACC 

5 9351 GCGCAGACCT CCGACCAGCT GGCCAGGGTG TGGGCGGCCA AGGCGTCCGT 

5 9401 CGCCGCCAAC CTGGACGCGG CCACGTCGGA GCTGCCGCTC GGCTTGTTCC 

5 9451 TGATGTTCTC GTGCGCCGCC GGTGTCCTCG GCAACGCGGG CCAGGCCGGT 

5 9501 TACGCGGCCG CCAACGCCTT CGTCGACGCC CTGGTCGGCC GCCGTCGCGC 

59 551 CACCGGCCTG CCCGGCCTGT CGATCGCCTG GGGCCTGTGG GCGCGCGGCA 

59601 GCGCCATGAC CCGGCACCTG GACGACGCCG ACCTCGCGCG GCTGCGTGCC 

5 9651 GGCGGGGTCA AGCCCCTGCT GGACGAGCAG GGCCTCGCCC TCCTCGACGC 

59701 GGCGCGCGCC ACCGCCGCGC ACACCTCGCT GGTGGTCGCG GCCGGTATCG 
59751 ACGTACGCGG ACTGAACAGG GACGACGTCC CCGCGATCCT CCGCGACCTG

5 9801 GCGGGCCGGA CCCGCCGCAG GGCGGCCGCC GACTCCACCG TCGACCAGGC 

59851 CGCGCTGGAG CGGCGCCTCA CGGGCCTGGA CGAGGCCGAG CGCCGGGCTG 

59901 TCGTCACCGA CGTCGTACGC GAATGCGTGG CGGCCGTGCT CGGCCACCGG 

5 9951 TCGGCGGCCG ACGTACGCAC CGAGGCCAAC TTCAAGGACC TCGGCTTCGA 
60001 CTCGCTCACT GCGGTGCAGC TGCGCAACCG CCTCTCGGCG GCGAGCGGCC 
600 51 TCCGCCTGCC CGCCACCCTG GCCTTCGACC ACCCCACCCC CCAGGCGCTG 
60101 GCGGCGTACC TCGGCACGCG CCTGAGCGGC CGGACCGCCA CCCCCGTCGC 
60151 ACCCGTGGCG CCTTCCGCGG CCGCGACGGA CGAGCCGGTG GCGATCGTCG 

6 0201 CGATGGCCTG CAAGTACCCG GGTGGAGCGA CCTCGCCGGA AGGCCTCTGG 
60251 GACCTGGTCG CGGAGGGCGT GGACGCGGTC GGCGCCTTCC CGACGGGCCG 
603 01 CGGCTGGGAC CTCGAACGGC TCTTCCACCC CGACCCGGAC CACCCCGGCA 
60351 CGAGTTACGC CGACG AAGGG GCCTTCCTTC CTGACGCGGG CG ATTTCGAT 
6 0401 GCGGCGTTCT TCGGGATCAA TCCGCGGGAG GCGCTGGCGA TGGATCCGCA 
6 0451 GCAGCGGCTG TTGCTGGAGG CGTCGTGGGA GG TGTTGG AG CGTGCGGGTA 
6 0501 TCGACCCGAC G ACG CTCAAG GGCACCCCGA CGGGCACGTA CGTCGGCGTG 
60551 ATGTACCACG ACT ACGCGG C AGGCCTCGCC CAGGACGCCC AACTGGAGGG 
6 0601 CTACTCCATG CTCGCCGGCT CCGGCAGCGT GGTGTCCGGC CGCGTCGCCT 
6 0651 ACACCCTGGG GCTTGAGGGT CCTGCGGTGA CGGTCGACAC CGCGTGCTCC 
60701 TCGTCCCTGG TCTCCATCCA CCTGGCCGCG CAAGCACTGC GACAGGGCGA 
60751 GTGCACTCTC GCCCTCGCGG GCGGCGTGAC CGTCATGGCC ACGCCCGAGG 
60801 TGTTCACCGG ATTCTCG CGC CAGCGCGGCC TGGCCCCCGA CGGCCGCTGC 
60851 AAGCCGTTCG CCGCCGCCGC CGACGGCACC GGCTGGGGCG AGGGTGTCGG 
60901 TGTGTTGTTG CTCGAGCGGT TGTCGGATGC GCGGCGTCAT GGGCGTCGGG 
60951 TGTTGGGTGT GGTGCGGGGT TCGGCGGTGA ATCAGGACGG TGCGAGTAAT 
61001 GGGTTGACGG CGCCGAATGG TCCGTCCCAG GAG CGGG TG A TTCG TCAGGC 
61051 GTTGGCCAGT GGTGGGTTGT CGTCGGTGGA TGTTGATGTG GTGGAGGGGC
61101 ATGGG ACGGG GACCACGTTG GGTGATCCGA TCGAGGCGCA GGCTCTGCTG 
61151 GCCACGTATG GGCAGGGGCG TCCGGTGGAT CGTCCGTTGT GGTTGGGGTC 
612 01 GGTGAAGTCG AATATTGGTC ATACGCAGGC GGCTGCGGGT GTTG CGGGTG 

612 51 TCATCAAGAT GGTGATGGCG ATGCGGCATG GTGTGGTGCC GGCGAGTTTG 

613 01 CATGTGGATG TGCCGTCGCC GCATGTGGAG TGGGATTCGG GTGCGGTGCG 

613 51 GTTGGCGGTT GAGTCGGTGC CATGGCCGGA GGTGGAGGGT CGTCCGCGTC 

614 01 GGGCGGGTGT GTCGTCGTTC GGGGCTTCGG GAACGAATGC GCACGTGATC 
61451 GTGGAGTCTG TGCCCGATGG GCTGGGGGAG GACTCGGTAT CGGTCAGTGG

615 01 TGAGGCTCCC GAGACTGAGA CTGACGGGCG CTTGGTGCCG TGGGTGGTAT 
61551 CGGCCCGCAG CCCGCAGGCC CTGCGCGACC AGGCACTACG CCTGCGTGAT 
61601 GCGGTGGCGG CCGACTCAAC GGTGTCGGTG CAGGATGTGG GCTGGTCGCT 
61651 GCTGAAGACG CGTGCGCTGT TCGAGCAGCG GGCGGTGGTG GTGGGGCGTG 
61701 AGAGGGCTGA ACTCCTGTCG GGGCTTG CTG TGTTGGCCGC TGGCGAGGAG 
61751 CACCCGGCTG TGACGCGGTC C CGTGAGG AC GGGGTTGCTG CGAGCGGTGC 
61801 TGTGGTGTGG CTGTTCAGTG GTCAGGGCAG TCAGTTGGTC GG T ATGGG TG 
61851 CTGGTTTGTA TGAGCGGTTC CCGGTGTTTG CGGCTGCGTT TGATGAGGTG 
61901 TGCGGCCTGT TGGAGGGGCC GTTGGGCGTG GAGGCGGGTG GGTTGCGGGA 
61951 GGTGGTGTTC CGTGGCCCGA GGGAGCGGTT GGATCACACG ATGTGGGCGC 
62 001 AGGCGGGGTT GTTTGCGCTG CAGGTGGGGT TGGCCCGGTT GTGGGAGTCG 
62051 GTCGGGGTGC GGCCGGATGT GGTGCTCGGG CATTCGATCG GTGAGATCGC 
62101 GGCCGCGCAT GTGGCGGGGG TCTTTGATCT GGCGGATGCC TGTCGGGTGG 
62151 TGGGGGCGCG GGCCCGTTTG ATGGG TGGGC TGCCTGAGGG CGGGGCGATG 
622 01 TGCGCGGTGC AGGCCACGCC CGCCGAGCTG GCCGCCGACG TGGACGACTC 
62 2 51 TGGTGTGAGT GTGGCGGCGG TCAACACACC TGATTCGACG GTGATTTCAG 
62 3 01 GGCCGTCTGG TGAGGTGGAT CGGATTGCTG GGGTGTGGCG GGAGCGTGGG 
62351 CGTAAGACGA AGGCGCTGAG CGTCAGTCAT GCCTTCCACT CGGCGTTGAT

62401 GGAGCCGATG CTCGCGGAGT TCACCGAAGC GATACGAGAG GTCAAGTTCA 

62451 CG CG G C CGAA GGTGTCGTTG ATCAGCAACG TCTCTGGTCT GGAGGCGGGT 

62501 GAGGAGATCG CGTCCCCGGA GTACTGGGCA CGCCATGTAC GCCAGACAGT 

62 5 51 GCTCTTCCAG CCCGGCATCG CCCAAGTGGC TTCCACGGCA GGCGTGTTTG 

62601 TCG AG CTCGG CCCCGGCCCC G TACTG ACTA CTGCCGCCCA GCACACCCTG 

62651 GACGACGTAA CCGATAGGCA TGGCCCCGAA C C GGT AC TGG TGTCCTCGCT 

62701 GGCCGGTGAG CGTCCTGAGG AGTCGGCGTT CGTGGAGGCG ATGGCTCGTC 

62751 TGCATACCGC TGGTGTTGCT GTGGACTGGT CGGTGTTGTT CGCGGGTGAT 

62801 CGTGTGCCTG GGCTGGTGGA GTTGCCGACG TATGCGTTCC AGCGGGAGCG 

62 851 GTTCTGGTTG AGCGGCCGTT CTGGGGGTGG GGATGCGGCG ACTTTGGGTC 

62901 TGGTGGCGGC GGGGCATCCG TTGTTGGGTG CGGCGGTGGA GTTCGCGGAC 

62 951 CGGGGTGGGT GTCTGCTGAC CGGTCGGCTG TCGCGGTCTG GGGTGTCGTG 

63 001 GCTTGCTGAT CATGTGGTGG CGGGTGCGGT TTTGGTGCCG GGTGCTGCGT 
6 3 051 TGGTGGAGTG GGCGTTGCGG GCCGGTGATG AGGTCGGTTG TGTGACGGTG 
63101 GAGGAGTTGA TGTTGCAGGC GCCTTTGGTG GTGCCTGAGG CGTCGGGTCT 
63151 GCGGGTTCAG GTGGTGGTCG AGGAGGCGGG TGAGGACGGG CGGCGCGGTG 
63201 TCCAGATCTA TAGCCGGCCT GACGCGGACG CCGTGAGCGG CGACGACTCG
63251 TGGATCTGCC ACGCG AC CGG CACCCTCACC CCCCAGCACA CCGACGCTCC 
63 3 01 GAACGACGGA CTGGCCGGCG CGTGGCCCGC GGCGGGCGCC GTGCCGGTGG 
63351 ACCTGGCGGG CTTCTACGAG CGCGTGGCGG ACGCGGGCTA -fGCGTACGGC 
634 01 CCGGGGTTCC AGGGGCTGCG TGCCGTGTGG CGGCACGGTC AGGACCTGCT 
63451 GGCCGAGGTC GTCCTGCCCG AAGCCGCGGG TGCCCATGAC GGCTACGGCA 
63 501 TCCACGCCGC CCTCCTCGAC GCCACCCTCC ACCCGGCCCT GCTCCTCGAC 
63 551 TGGCCCGGG G AGGTGCAGGA CGACGACGGG AAGGTCTGGC TGCCTTTCAC 
63 601 CTGGAACCAG GTCTCCTTGC GGGCTGCGGG AGCCGCCACC GTACGCGTAC 
63651 GTCTCTCGCC CGGCGAGCAC GACGAGGCGG AACGGGAAGT ACAGGTACTG

637 01 GTGGCCGACG CCACCGGGAC CGACGTCCTG AGCGTGGGGT CGGTGACGTT 

63751 GCGTCCCGCC GACATCCGGC AACTGCAGGC CGTGCCGGGT CACGACGACG 

63801 GTCTGTTCTC GGTGGACTGG ACGCCGCTGC CGCTGTCGCG GACGGATGTG 

63851 TCGCAGACGG ATGCCGACGG GGATGCCGAC TGGGTTGTGC TCTCGGACGG 

63 901 TGTCGGC AG C CTGGCTGATG TGGTGTCGGC GGCGGGTGGT GAAGCGCCGT 

63951 GGGCAGTGGT CGCTCCCGTC GGTGCATCCG CGGGCGGCGG CCTTGCCGGC

64 001 TTTGACCGCC GTGAGGGTCT TGACGGTCGG CTGGTCGTGG AGCGGGTGTT 
64 051 GTCACTCGTA CAGGAGTTCC TGGCCGCGCC GGAGCTGGCC GAGTCCCGGC 
64101 TCCTCGTGCT GACCCGCGGC GCCGTGGCGA CCGGCGGCGA CGGCGACGGT 
64151 GATGTGGACG CGTCCGCCGC AGCCGTATGG GGCCTGGTCC GCAG TGCTCA 
64 201 GTCCGAGAAC CCGGGCCGCT TCATCCTGCT CGACGTGGAC ATGGACGTGG 
64251 ACGTCGACGT GGACATGGAC GTGGACGTCG ACGTGGACGT CGACGTGGAC 

643 01 GTGGACGGAG ACGGCAATGG CAGCGACCTG GACCCGGACC TGAACGGCCG 
64 351 ACGACTTCCC CACGCCACCC TGCGTCACGC CGCCGAGGAA CTCGACGAGC 

644 01 CCCAACTCGC CCTGCGCGAC GGACAACTGC TCGTTCCGCG GCTGGTCCGC 
644 51 GCCACCGGCG GCGGACTCGT CGTGGCGCCC ACCGACCGTG CCTGGCGCCT 
64 501 GGACAAGGGA AGCGCCGAGA CGCTGGAGAG CGTCGCGCCG GTCGCGTACC 
64 551 CCGGAGTCAT GGAACCCCTG GGCCCCGGCC AGGTCCGCCT CGGCATCCAC 
64 601 GCCGCGGGCA TCAACTTCCG CGACGTCCTG GTCAGCCTCG GCATGGTGCC 
64651 CGGCCAGGTC GGCCTGGGCG GCGAAGGCGC CGGTGTCGTG ACGGAGACAG 
64 701 GCCCCGATGT CACCCACCTG TCGGTCGGCG ACCGCGTGAT GGGCGTCCTC 
64 751 CACGGCTCCT TCGGCCCGAC GGCCGTGGCG GACACCCGCA TGGTCGCGCC 
64 8 01 GGTTCCGCAG GGCTGGGACA TGCGGCAGGC GGCCGCGATG CCCGTCGCGT 
64 851 ATCTGACGGC TTGGTACGGG TTGGTGGAGC TGGCCGGTCT GAAGGCGGGC 
64 901 GAGCGCGTGC TGATCCACGC AGCCACGGGT GGTGTGGGAA TGGCGGCGGT 
64951 GCAGATCGCC CGTCACCTGG GTGCCGAGGT GTTCGCCACC GCCAGTGCAG

65 001 CCAAGCACGT CGTACTGGAA GAGATGGGCA TCGACGCCGC CCACCGCGCC 
65 051 TCCTCCCGGG ACCTCGCCTT CGAGGACACC TTCCGGCAGG CCACCGACGG 
65101 GCGCGGCATG GACGTCGTCC TCAACAGCCT GACCGGCGAG TTCATCGACG 
65151 CATCTCTGCG GTTGCTCGGC GACGGCGGCC GGTTCCTGGA GATGGGCAAG 

652 01 ACCGATGTGC GCACGCCGGA GGAGGTGGCC GCGGAGTACC CGGGTGTdAC 
65251 CTACACCGTG TACGACCTCG TCACCGACGC GGGGCCGGAT CGCATCGCGG 

653 01 TCATGATGAG TGAGCTGGGC GAGAGGTTCG CTTCCGGTGC CCTTGACCCT 

653 51 CTGCCGGTGC GTTCCTGGCC GCTGGACAAG GCGCGTGAGG CGTTCCGGTT 

654 01 CATGAGTCAG GCCAAGCACA CCGGCAAACT CGTACTCGAC GTGCCCGCAC 
654 51 CGCTCGACCC CGACGGGACC GTCCTGATCA CCGGAGGCAC GGGGGCGCTG 
65 501 GGGCAGGTCG TGGCCGAGCA TCTGGTGCGG GAGTGGGGCG TACGGCACCT 
65551 GCTGCTGGCC AGCCGCCGTG GACTGGACGC CCCGGGCAGC GGTGAACTCG 
65601 CGGACAGGCT GTCGGACTTG GGCGCCGAGG TGACCGTCGC GGCGGCCG AT 
65651 GTGAGCGACC CGGCCTCGGT GGTGGAGCTG GTCGGCAAGA CGGATCCCTC 
65 7 01 GCATC CGTTG ACGGGTGTCG TGCACGCGGC GGGCGTGCTT GAGGACGGGA 
65751 TCGTGACGGC TCAGACGCCT GAGGGGCTGG CGCGGGTGTG GGCGGCCAAG 
6 5 801 GCCGCTGCGG CGGCGAATCT CCATGAGGCG ACCCGGGAGA TGCGTCTCGG 
65851 TCTGTTCGTG GTGTTCTCCT CGGCGGCCGC CACGCTCGGC AGTCCGGGCC 
6 5 901 AGGCCAACTA CGCGGCTGCC AATGCCTATT GTGACGCGCT GATGCAGCGC 

65 951 CGACGGGCGG CGGGCCAGGT CGGCCTGTCG GTCGGCTGGG GTCTCTGGG A 

66 001 GGCACCGGAC GCCAAGCCGG GTGTTGCCGC CGACG CC AAA CCGGATGTTG 
66 051 CCGCCGACGC CAAGACGGGA GTTGCCGCCG ACGGCACTCC CCAGGGCATG 
66101 ACCGGCACCC TGAGCGGCAC CGACGTGGCC CGCATGGCAC GCATCGGCGT 
66151 CAAGGCGATG ACC AGCG C AO ACGGTCTCGC CCTGCTCGAC GCCGCPlCPlCC 
66201 GCCACGGCCG CCCCCACCTC GTCGCCGTCG ACCTCGACAC CCGCGTCCTG 
66251 GCGCACAAAC CCGCCCCGGC CCTCCCCGCC CTCCTGCGCG CCTTCGCCGG

663 01 AGACCAGGGA GGCCAGG GAG GCGGCCGAGG CGGCGGTCGG GGCGGCGGCC 

663 51 CGGCACGACC GGCGGCCGCC ACCACCCGGC AGAACGTCGA CTGGGCCGCG 

66401 AAGCTCTCCG TCCTGACAGC CGAGGAACAG CACCGCACCC TCCTCGACCT 

66451 GGTACGGACG CACGGGGCAG CCGTCCTCGG GCACGCGGGC ACCGACGCCG 

66501 TACGCGCCGA CGCCGCCTTC CAGGATCTCG GCTTCGACTC CCTCACCG^G 

66551 GTCGAACTGC GCAACCGCCT CTCCGCCTCC ACCGGCCTGC GCCTGCCCGC 

66601 CACGTTCATC TTCCGGCACC CGACCCCGTC GGCCATCGCC GACGAACTGC 

66651 GCGCACAGCT GGCCCCCGCG GGGGCCGACC CGGCCGCGCC GCTCTTCGGT 

66701 GAACTGGACA AGCTGGAGAC GGTGATCACG GGGCACGCGC ACGACGAGAG 

66751 CACCCGGACC CGCCTGGCGG CACGCCTGCA GAACCTGCTG TGGCGCCTGG 

66 801 ACGACACTTC GGCCCGCTCG GACCACGCGG CCGGCGCGAG CGACGCCGAC 

668 51 GGCGACGCCG TCGAGAACCG AGACCTCGAG TCCGCGTCGG ACGACGAGCT 

6 6901 CTTCGAGCTG ATCGACCGAG AACTGCCTTC TTGATCAGGA GTGGAGAAGA 

6 6 951 CATGC CGGGT ACGAACGACA TGCCGGGTAC CGAGGACAAG CTCCGCCACT 

67001 ACCTGAAGCG AGTGACCGCG GATCTCGGAC AGACCCGTCA GCGCCTGCGC 

67051 GACGTGGAGG AGCGCCAGCG GGAACCGATC GCCATCGTCG CGATGGCCTG 

67101 CCGCTACCCG GGCGGGGTGG CCTCCCCCGA GCAGCTGTGG GACCTGGTCG 

67151 CCTCACGCGG CGACGCCATC GAGGAGTTCC CCGCCGACCG CGGCTGGGAC 

67201 GTGGCGGGCC TCTACCACCC CGACCCGGAC CACCCCGGCA CGACCTATGT 

672 51 ACGAGAGGCC GGATTCCTGC GGGACGCCGC CCGCTTCGAC GCCGACTTCT 

673 01 TCGGCATCAA CCCGCGCGAG GCGCTCGCCG CCGACCCGCA GCAACGGGTG 
67351 CTCCTCGAAG TGTCGTGGGA ACTG TTCG AG CGGGCGGGCA TCGACCCCGC 

674 01 CACGCTCAAG GACACCCtCA CCGGOGTGTA CGCGGGGGTG TCC AGCCAGG 
674 51 ACCACATGTC CGGGAGCCGG GTCCCGCCGG AGGTCGAGGG CTACGCCACC 
67501 ACGGGAACCC TCTCCAGCGT CATCTCCGGC CGCATCGCCT ACACCTTCGG 




67551 CCTGGAGGGC CCGGCGGTGA CGCTCGACAC GGCGTGCTCG GCATCGCTGG 

676 01 TCGCGATCCA CCTCGCCTGC CAGGCCCTGC GCCAGGGCGA CTGCGGCCTG 

67651 GCGGTGGCGG GAGGCGTGAC CGTACTGTCC ACGCCGACGG CGTTCGTGGA 

67701 GTTCTCACGC C AG CGCGG AC TCGCACCGGA CGGCCGCTGC AAGCCGTTCG 

67751 CCGAGGCCGC CGACGGCACC GG AT TCTCCG AGGG CGTCGG CCTGATCCTC 

67 8 01 CTGGAACGCC TCTCCGACGC CCGCCGCAAC GGACATCAAG TACTCGGCGT 

678 51 CGTAGGCGGA TCGGCCGTCA ACCAGGACGG CGCGAGCAAC GGCCTGACCG 

67 901 CCCCGAACGA CGTCGCCCAG GAACGCGTGA TCCGCCAGGC CCTGACCAAC 
67951 GCCCGCGTCA CCCCGGACGC CGTCGACGCC GTGGAGGCAC ACGGCACCGG 
68001 CACCACGCTC GGCGACCCGA TCGAGGGGAA CGCACTCCTC GCGACGTACG 
6 8051 GAAAGGACCG CCCCGCCGAC CGGCCGCTGT GGCTCGGCTC TGTGAAGTCG 
68101 AACATCGGCC ACACGCAGGC GGCTGCGGGC GTCGCAGGCG TCATCAAGAT 
68151 GGTGATGGCG ATGCGCCACG GCGAGCTGCC CGCCTCCCTG CACATCGACC 
68201 GGCCCACGCC CCACGTGGAC TGGGAGGGCG GGGGAGTGCG GTTGCTCACC 
682 51 GATCCCGTGC CGTGGCCACG GGCCGACCGC CCCCGCCGCG CGGGGGTCTC 
6 8301 CTCCTTCGGC ATCAGCGGCA CCAACGCCCA CCTGATCGTG GAACAGGCCC 

68 3 51 CCGCCCCGCC CGACACGGCC GACGACGCCC CGGAAGGCGC GGCAACCCCC 
684 01 GGCGCTTCCG ACGGCCTCGT GGTGCCGTGG GTGGTGTCGG CCCGTAGTCC 

684 51 GCAGGCCCTG CGTGATCAGG CCCTGCGTCT GCGCGACTTT GCCGGTGACG 

685 01 CGTCCCGAGC GCCGCTCACC GACGTGGGCT GGTCTTTGCT GCGGTCG CGT 
68551 GCGCTGTTCG AGCAGCGGGC GGTGGTGGCG GGGCGTGAGA GGGCTGAACT 
68 601 GCTGGCGGGG CTGGCTGCGT TGGCCGCTGG TGAGGAGCAC CCGGCTGTGA 
6 8651 CGCGGTCCCG TGAGGAAGCG GCGGTTGCTG CG AGCGGTG A TGTGGTGTGG 
6 8701 CTGTTCAGTG GTCAGGGCAG TCAGTTGGTC GGTATGGGTG CTGGTTTGTA 
68751 TGAGCGGTTC CCGGTGTTTG CGGCTGCGTT TGATGAGGTG TGCGGCTTGC 
6 8 801 TGGAGGGGGA GCTGGGGGTT GGTTCGGGTG GGTTGCGGGA GGTGGTGTTC 
68851 TGGGGCCCGC GGGAGCGGTT GGATCACACG GTGTGGGCGC AGGCGGGGTT

68901 GTTTGCG TTG CAGGTGGGGT TGGCCCGGTT GTGGGAGTCG GTCGGGGTGC 

6 8 951 GGCCGGATGT GGTGCTCGGG CATTCG AT CG GTGAGATCGC GGCCGCGCAT 

69001 GTGG CGGGGG TCTTTGATCT GGCGGATGCG TGTCGGGTGG TGGGGGCGCG 

69051 GGCGCGTTTG ATGGGTGGGT TGCCTGAGGG TGGGGCGATG TGTGCGGTGC 

6 9101 AGGCCACGCC CGCCGAGCTG GCCGCGGATG TGGATGGCTC GTCCGTGAGT 

6 9151 GTGGCGGCGG TCAACACACC TGACTCGACG GTGATTTCAG GTCCGTCGGG 

6 9201 TGAGGTGGAT CGGATTGCTG GGG TGTGGCG GGAGCGTGGG CGTAAGACGA 

6 9251 AGGCGCTGAG CGTGAGTCAT GCTTTCCATT CGGCGTTGAT GGAGCCGATG 

6 9301 CTCGGGGAGT TCACGGAAGC GATACGAGGG GTCAAGTTCA GGCAGCCGTC 

6 9351 GATCCCGCTC ATGAGCAATG TCTCCGGAGA GCGGGCCGGC GAGGAGATCA 

69401 CATCCCCGGA GTACTGGGCG AGGCATGTAC GCCAGACAGT GCTCTTCCAG 

6 9451 CCCGGCGTCG CCCAAGTGGC CGCTGAGGCA CGCGCGTTCG TCGAACTCGG 

6 9501 CCCCGGCCCC GTACTGACCG CCGCCGCCCA GCACACCCTC GACCACATCA 

6 9 551 CCGAGCCGGA AGGCCCCG AG CCGGTCG TCA CCGCGTCCCT CCACCCCGAC 

6 9601 CGGCCGGACG ACGTGGCCTT CGCGCACGCC ATGGCCG AC C TCCACGTCGC 

6 9651 CGGTATCAGC GTGGACTGGT CGGCGTACTT CCCTGACGAC CCCGCCCCCC 

6 9701 G C ACCGTCG A CCTGCCCACC TACGCCTTCC AGGGGCGGCG CTTCTGGCTG 

6 9751 GCGGACATCG CGGCGCCCGA GGCCGTGTCC TCGACGGACG GTGAGGAGGC 

6 9801 CGGGTTCTGG GCCGCCGTCG AAGGTGCGGA CTTCCAGGCG CTCTGCGACA 

6 9851 CCCTGCACCT CAAGGACGAC GAGCACCGCG CGGCTCTGGA GACGGTGTTC 

6 9 901 CCCGCGCTGT CCGCGTGGCG GCGCGAACGA CGTGAGCGGT CGATCGTCGA 

6 9 951 TGCCTGGCGG TACCGGGTCG ACTGGCGGCG CGTCGAGCTG CCGACACCCG 
700 01 TTCCGGGCGC CGGTACCGGT CCCGACGCCG A CACGGG CCT CGGGGCGTGG 

7 0051 CTGATCGTGG CTCCCACGCA CGGGTOGGGT ACTTGGCCGC AAGCCTGTGC 
7 0101 CCGGGCG TTG GAGGAGGCGG GCGCGCCGGT ACGTATCGTC GAGGCCGGCC 
70151 CGCACGCCGA CCGGGCGGAC ATGGCGGACC TGGTCCAGGC ATGGCGGGCA

702 01 AGCTGTGCGG ACGACACCAC CCAGCTCGGA GGAGTGCTCT CCCTGCTGGC 

702 51 TCTCGCCGAG GCACCGGCCA CCAGTTCCGA CACCACTTCC CACACCAGTA 

703 01 CCAGTTGCGG TACCGGCTCT CTCGCGTCCC ACGGCCTCAC CGGCACCTTG 
70351 ACGCTGCTGC ACGGTCTGCT GGATGCGGGC GTCGAAGCGC CTCTCTGGTG 

704 01 TGCCACGCGC GGCGCCGTGT CGTGCGGCGA CGCCGATCCG CTCGTCTCtC 
704 51 CGTCGCAGGC CCCGGTCTGG GGACTCGGAC GCGTGGCCGC CCTGGAGCAT 
7 0501 CCGGAGTTGT GGGGCGGCCT GGTCGACCTG CCCGCCGACC CGGAGTCGCT 
7 0551 CGACGCGAGC GCGTTGTATG CGGTTCTGCG CGGAGACGGC GGCGAGGATC 
7 0601 AGGTCGCGCT GCGCCGGGGC GCGGTCCTCG GCCGTCGCCT GGTGCCCGAC 
70651 GCAACCCCGG ACGTGGCCCC CGGCTCGTCC CCGGACGTGT CCGGAGGCGC 
70701 AGCCCATGCC GACGCGACCT CCGGGGAG1G GCAGCCGCAT GGTG CCGTCC 

707 51 TCGTCACCGG AGGCGTCGGC CACCTGGCCG ATCAGGTCGT ACGGTGGCTC 
70 801 GCCGCGTCCG GCGCCGAACA CGTCGTACTC CTGGACACGG GCCCCGCCAA 

708 51 CAGCCGTGGT CCCGGCCGGA ACGACGACCT CGCCG CGGAA GCCGCCGAAC 
7 0901 ACGGCACCGA GCTGACGGTC CTGCGGTCCC TGAGCGAGCT GACAGACGTA 
70951 TCCGTACGTC C C ATACGG AC CGTCATCCAC ACATCGCTGC CCGGCGAGCT 
71001 CGCGCCGCTG GCCGAGGTCA CCCCCGACGC GCTCGGCGCG GCCGTGTCCG
71051 CCGCCGCGCG GCTGAGCGAA CTCCCCGGCA TCGGGTCAGT GGAGAGCGTG 
71101 CTGTTCTTCT CCTCCGTGAC GGCTTCGCTC GGCAG TAGGG AGCACGGCGC 
71151 GTACGCCGCC GCCAACGCCT ACCTCGACGC CCTGGCGCAA CGGGCCGGTG 

712 01 CCGATGCTGC GAGCCCCCGG ACGGTCT CGG TCGGGTGGGG CATCTGGGAT 
71251 CTGCCGGACG ACGGTGACGT GGCACGCGGC GCCGCCGGGC TGTCCCGGAG 
71301 GGAGGGACTC CCGCCGCTGG AACCGCAGTT GGCGCTCGGC GCCCTGCGCG 

713 51 CGGCGCTCGA CGGGGGCAAG GGGCACACGC TGGTCGCCGA CATCGAGTGG 

714 01 GAGCGGTTCG CGCCGCTGTT CACGCTGGCC AGGCCCACCC GGCTGCTCGA 
71451 CGGGATCCCC GCGGCCCAGC GGGTCCTCGA CGCCTCCTCG GAGAGCGCCG
71501 AGGCCTCGGA GAACGCCTCG GCCCTCCGTC GCGAACTGAC GGCCCTGCCC 

715 51 GTGCGGGAGC GGACCGGGGC ACTTCTCGAC CTGGTCCGCA AACAGGTGGC 
71601 CGCCGTCCTG CGCTACGAGC CGGGCCAAGA CGTGGCGCCC GAGAAGGCCT 
71651 TCAAGGACCT GGGCTTCGAC TCGCTCGTGG TCGTGGAGCT GCGCAACCGG 
71701 CTGCGCGCCG CCACCGGGCT CCGGCTGCCC GCCACCCTGG TCTACGACTA 
71751 CCCCACACCC CGCACCCTCG CCGCACACCT GCTGGACAGG GTGCTGCCCG 
71801 ACGGCGGCGC GGCAGAGCTC CCCGTGGCCG CCCACCTGGA CGACCTGGAG 

718 51 GCGGCCCTCA CCGACCTGCC GGCCGACGAC CCCCGGCGCA AGGGCCTGGT 

719 01 CCGGCGTCTA CAGACGCTGC TGTGGAAGCA GCCCGACGCC ATGGGGGCGG 
71951 CGGGCCCCGC CGACGAGGAG GAGCAAGCCG CGCCCGAGGA CCTGTCGACC 
72001 GCGAGCGCCG ACGACATGTT CGCCCTGATC GACCGGGAGT GGGGCACGCG 
72051 GTGAGCGGGG TGGAGCGGGG TGTGGGGTCG GCGGGCCCTG TGGAACAGGG 
72101 TGACGGACTC GCGGGCCTGG TCGAGCGGGC CGAGGCGCTG GCCGCTCTGC 
72151 GGGGCGCCTT CGACGGCTCC CCGGGCACCG GCGGCAGCCT CGTCGTGCTC 
72201 AGCGGCGCGG TGGGCACCGG CAAGACCGCG CTGCTACGGG CGTGGGCCGA 
72251 CCGCATGGGC GCCGATGCCG ACGCCCTGGT CCTGACCGCC ACCGCCTGCC 
72 3 01 GCGCCGAGCG CGACCTGCCG CTTGGCGTCC TGGAACAGCT GGTACGCAGC 

723 51 CCCGGCCTGC CCCCGGCCAG CGCCGAGCGC GCGCTGGCGT GGTGGGACGA 

724 01 GGAGGCCTCG GCCACCCCCG GAAAGACGGA CGCGAACGGG ACGAGTGCCA 

724 51 ACGGGACGGA CGCCAACGGG ACGGGCGCGG GACAGACGGG CGCGGGGCAG 

725 01 GCGGGCGTGG GACAGACGGG CGTGGGCGGA GAGCCCGTCC TGGCCGCCTC 
72 5 51 CGCCCTGCGA GGCCTGTGCG AGGTGCTGCG GGACCTGCTC GCCGAGCGGC 
72 601 CCGTCGTGGT CGCCGTCGAC GACGCGCACC ATGCCGACGC GGCGTCG CTC 
72651 CAGTGCCTGC TCTCCGTGGT GCGCCGGCTG CGGTCGG C AC GACTCCATGT 
72701 GCTGTTCACC GAGTACGCCC ATCAGAAGGC GCAGAACGCC CTGCTGAGCA
72751 GCGAGTTCCT GCACGAGCCC GCCCTGCGGC GGATCCGCCT GGAACCGCTG

72801 TCGAAGGCGG GCGTGGAGGC CTTGCTCGCC CGGCACCTCG ACGAGCGGAC 

72851 GGCACAAGAC CTCACCCCCG TCGTCCACGG CATGAGCGCG GGCCACCCGC 

72 901 TCCTCGTACG GGCGCTGGCC GAGGACCACC GTGCGGCGGG CGGCGCCGGG 

72 951 GAGGCGTACG GTCGTGCCGT CCTCAGCTTT CTGTACCGGC ACGAGACTCC 
73001 GGTCACCCAA GTCGCCCGCG CCATCGCTGC GTTGGGCGCG CACGCCGGAC 
73051 CCGGTCAGGT CGGGCGGCTG CTCGATGTCG ACGCGGCGTC CGTCGAGCGG 
73101 GCCGTGCGGC AGCTGACCGT CGCGGAGGTG CTGCACGAGG GCCGCCTGTG 
7 3151 CCACCCGGCG TTCGCGGCGG CGGTCCTGGA CGGCATGCCG CGCGAGGAAC 

73 201 GCCGCGCCCT GCACGGACGG GTCGCCGACC TCCTGCACGA GGAGGGGGCG 

732 51 CCGGCCACCG AAGTGGCCGC CCACCTCGTC GCCGCCGACC GGTCCGACGC 

733 01 CCCGTGGGCG GTACCCGTCT TCCAGGAAGC GGCCCAACTC GCCCTGGACG 

733 51 AGGACCAGGT GGAGACCGGC GTCGACTATC TGCGCGCGGC CCACCAGCGG 

734 01 TGCCGGGGCG CCGCGCAGCG TGC CGCGGTC GTCGGTGCGC TCGCCGACGC 
73451 CGAGTGGCGG CTCGACCCAG CAAAGGTCCT GCGCCACCTG CCCGACCCTG 
7 3 501 CAGCCATGGC CCCACAAACG GACCCTGCCG CCCTGGCCCC AC AC ACG G AC 

735 51 CCCGCACCCA CAGCCGCACC CACAGCCGCC CCCACCCCCA CCCCCATCCC 
7 36 01 GACCACCCCA CCCCTCCCCA CCCACCTGCT CTGGCACGGG CGGGTCGAGG 
73 651 AAGGCCTGGA CGCCATCGGC ACGCTCACCG GGCCCGGACC CAACCCGGCG 
73701 GGTGCGCCGC CGATGAACCC CGCGGACCTG GACACCCCAT GGCTGTGGGG 
73751 CGCCTACCTC TATCCCGGGC ACGTCAAGGA GCGCCTGGGA TCCGGCGCCC 
73801 TGTCCCCGCA GCGCTCGACC CCGCCGGCGG TCACGCCGGA GCTCCAAGGC 
73 851 GCGGGCACGC TGATGAACGA CCTGCTGCAC GGCGGCGAAC GCGACGCCAC 
73 901 CGAGGCCGCC GAGCGCGCCC TCAACCGCTA CCGGCTCGGC CCCCGCACCA 
7 3 951 TCGCGGTCCA GACGGCCGCG CTGGCCGCCC TCACCTACCG CGACCGGCCG 
74001 CACCGCGCGG CCGCCTGGTG CGACGGCCTC GTCGCCCAGG CCGACGAGCG
74051 CAACAGCCCC ACCTGGCGGG CCCTGTTCAC CGCGTGGCGT GCCCTGCTCC

74101 ACCTGCGGCA GGGCGACCCG GCCGCAGCGG AACAGCGCGC CGAAACCGCC 

74151 CTCGCCCTGC TCGGATCGAA GGGCTGGGGC GCCGCGATCG GCCTGCCGCT 

742 01 GGCAGCCGCC GTACAGGCCA AGGCGGCCCT CGGCGATGTC GACGGGGCGG 

74 251 CGGCCCTCCT GGAACGGCCC GTGCCCCAGG CGGTCTTCCA GACCCGCACC 

74 3 01 GGACTGCACT ACCTGGCGGC CCGGGGCCGC TATCACCTCG CCACCGGCTG 

74351 CCACTACGCC GCACTGTGCG ACTTCTACGC CTGCGGGACC CGCATGAGCA 

744 01 GCTGGGGAGT GGACCTGCCC GCGCTGGAGC CGTGGCGCCT CGGCGCGGCG 

744 51 GAAGCGTACC TGGCCCTCGG CGAAGGACTC CTGGCACGCC AACTCGTCGA 

74501 CGGCCAGCTG CCGTTGCCCA CGCCTGACGA CGGCCGCACC TGGGGCATGA

74 551 CGTTGCGCCT GCGGGCGGCC ACGTCCCCCG CGCCGGCCCG GGCCGAACTC 

74 601 CTCGACGAGG CCGTGG CGGT GCTCCGGGAG AGCGGCGACA CCTTCGAGCT 

74 651 GGCGCGGGCC GTCGCCGACC AGGCTGTTGC CGTACGCGAA GGGGGCGAGG 

74701 CGGAACGCGC CCGGCTGCTG GCCCGCAAGG CGGAGCTGCT GGCCCGGCGC 

74751 TGGGGCAGCG CCCCCGCGCC CGCCACCGTC CCCGAACCGC CGGAGCGGCC 

74801 AGGACCGGCC ACTCCGGACG CCGAACTGAC CAGTGCGGAG CGGAGGGTGG

74 851 CCGAGCTGGC CGCCGAAGGG TTCACCAACC GGG AG AT CT C CCGGAAGCTG 

74 901 TGCGTCACGG TCAGCACCGT GGAACAGCAC CTGACCCGGA TCTACCGGAA 

74 951 GCTCGACGTC AGGCGACTGG ACCTCCAGGC AGCCCTCGGC TGACCTTCAG 

75001 GCGGCCCTCG GCTGACCGCA GGCCACGCGC CTACGGTCAG CC TTCCTG AG 

7 5051 TCAGGACCGT ACAGCCGCCG TAGGTGTAGG TGTAGGCGTG GGCGAGATCG 

7 5101 TCGCCGCGTC CAGACCCACC ACGGCCAGCT CCTCGGGAAG GAACGGGGGA 

75151 GCGGTCAGCT CCGGGAGGCG TTCGTCGGCG CGCATCGCCA TCAGGAAACG 

7 5201 GTTGGAGCCC AGTTCGGCCT GCGGCGCGTT GAGGCTCATC ACGTCCGTGA 

75251 CGATCTCGGA CGCCTTCGGG G AACGG AT CG ACGCCGCGGT GATGGCCTCG 

75301 GCGAACCGCA GACGCTGCTC GGTGTCCACA CGGATGAGCC GCGGATCCGT
75351 CGCCGAGACA CGGCAGTTGA CGTAGTCGAT GTCCTTGGTC GCGGCGAGGA
7 5401 TCCACGGGTC GTCCACGGCC GCGCCGATCG CCTTCTGCAG GGCGCGGGTG 

754 51 CCGGCGCGGG CGGACCCCGT ACCCTCCTGC ACGCTCCGCT CGAACTCGCG 

755 01 GTCGATCGTG GTGGCGCAGC GCGCGGCCGA GCTCATGCCG TGGCCGTAGA 
75551 TCGGGTTGAA AG CGGTC AG C GAGTCGCCGA TGACGAGCAG ACCGTCGGGC 

75601 CACTGTTCGA GGCGCTCCGG ATAGAGGCGG CGGTTGGCGC CGGAGCGCGA
75651 ACCGAAGACG GGGGTGAGTG GTTCGGCGTC CCGGAGCAGG TCGGCGAGGA
75701 TCGGGTGGTT CAGGTTCTCG GCGAAGGGGA TGAACTCGTC CTCGTGTGTG 
75751 GGCAGTTGCG CGCCCCGCGT GCAGGAGAGC GTCGCGAGCC AGCGGCCGCC 
75801 CTCGATGGGG TAGACCACGC CGAAGCGGCC GGGTTCGCGC ACGCGGTCGT 
75 851 CGGCGGCGAT GTTCACGGCG GGGAAGTGCG TCGTAGCGCC CGGCGGGGCC 
7 5 901 TTGAAGAGCC GGGTGGCGTA GGCGACGCCC GCGTCCACGA CGTCTTCCTC 
75 951 CAGTGCCGGC ACGCCGAGGG CGGCGAGCCA CTGCTTGAGG CGGGAGCCGC 
76001 GCCCGGTGGC GTCGATCACC AGGTCGGCCT CCAGCTGCTC CTGCCGACCG 
76051 CTGTCGAGGT CGCGGACGAC GACACCGGTG ACCCGGCCGC CACTGCCACC 
76101 ACCACTTCCC G TCAGCTCG A CGGCCTCGGT GCGCTGCCGG ACGGTGATGT 
76151 TGTCGG CTCC CAAGGCCTGC TGACGTACCG TCAAGTCCAG CAGCGGGCGG 
762 01 CTGGCGACCA GCGCGAACTG GGTGGCGGGG AAGCGGTGCT GCCACCCCTG 
7 6251 AC CGGTC AG C GTCACCAGGT CCTCGGGGAA GCCGAGGCGG CGGGCGCCGG 
7 6301 CCGCGAGGAG GCGGTCGGTG GTGCCGGGCA GCATCTCCTC GATGAGGCGG 
76351 GCGCCGTTGG ACCACAGGAG GTGCGCGTGG CGGGCCTGCG GGACCCCCTT 

764 01 GCGGTGCTGG GGCTCCTCGG GCAGCGCGTC ACGTTCCACG ACGGTGACGG 
76451 CGTCGACGTG CCGGGCCAGG ACGTGGGCCG CCAGGGTGCC TGCCATGCTG 
7 6501 GC ACCCAGGA CGACGGCATG TGCGGGTCGG GTGG TGGTCA CGCGCGTATC 

765 51 CCTTCGGGGT GGG TGGTGTC GGCGGGCCCG GCCGGATCGT CCATGGTCAC 
76601 GTCCGTGACG CCCCAGAACG CCTGGACCCG GCGGCCGAGC CCGTGCTCGT
76651 CGAGTTCGAC GATGCCGACG ATGCGGAAGG TCATCGGCCG CGGCCGCTGC

7 6701 ACGGTGACCG TGGTCGGCGT CACCACGAAA CGGTCGTCCA TCGACGTCAT 

76751 CGGCGGGTCC GGCACCTCGT GCGTACCGCA GGAGACGGCC AGTTCGAGAT 

76801 GGCGGCGGAG ATCGTCCTTG CCCACCATCG GGGGCCGCCC CACGGGGTCC 

7 6 851 TCGAAGACGA TGTCGTCCGT GAACAGGTCG AGGACGCCTT CGATGTCACC 

7 6 901 GGCGTTGATG CGCTCGGCGT AGTCGACGGC CATCTGCTTG CG CGCGG(?C T 

7 6 951 CGTCGGG CAT GGCACCTCCA GGAAGGGTGG GC AG A CCTTG TGAAAGTCAT 

77 001 CGAGGGCCGT TCGGTTCAGC CGAGGACCGT GAGATCGGAT GTGCCCCAGT 

77051 ACGACTTCAG ATGCCGGATG AGGCCGGACG CGTCCATGCG GATCACGAGC 

77101 ATCGCCGTGC GGTGTATGCG GGCCGTCCCC GGGGCGTCGG GGG CCTTG AG 

7 7151 CCAGCCCCGC TCCGCGTAGA GCGGGCCCAC GGG CAGGTAG TCCATGACGG 

772 01 AGGAAATCTG GATCAGCGCG TGCGTGG CGT CCTGCCCGGC GACGGGCTCG 

772 51 GCCGCCTCCT CGCGCAGGTG CGCGGCGAGC AGCGGTTCGT AGTGGGCGCG 

77301 CAGCGCGTCG TGCCCGGTGA CGGGCGGGAG GCCGACCGGG TCCTCGAGGA 

7 7351 CCGCGTCGGG CGCGTACAGA TCGATGATCG CGTCCAGGTC CCCGGCGTTG 

7 7401 ATCCGCCGGC TGTGCTCCAG GGCCCGCTTC TTGCGGGCGA ACTCGTTCAT 

7 7451 CGCTGCCCCT CCACTGCCTG ACCGTGTCCG TTGCCGTTGC CGTTGCCGTT 

7 7501 GCCGTTGCCG TGTCCGTTGC CCTGCCCGGT GGGCTGTCCG TTGCCCTGTC 

77 551 CGCTCGCGCC GTCCCTGCCG AGGTC CCGGT CGATGAACGC GAAGATCTCG 

7 7 601 TCCGCCGACG CGTCCTGGAT ACGTGTACGA GTGGCCACCG CGACCTCGCC 

7 7651 GGCCGTGTCC TGCGGCGCGT CGAGCCTGGC CAGCGTCGCG CGCAGCCGCC 

7 7 701 CCGCCAGTTC GGCCCGCGCC GAGCCGTCCT TCGAGGAGAC CGAGAGCAGC 

77751 GAGTCCTCGA TGCGCTCGAA CTCCGCCAGG ACGTCGGCGA GCGGATCCGC 

77 801 CGCGCGCGGG GCCAGCTCCT GCCGCAGCTG CGCGGCGAGC TCCGCCGGGT 

77 8 51 TGGGATGGTC GAAGACGAAC GTGGCGGGCA GCTTCAGCCC CGTCGCGGCC 

77901 GAGAGCCGGT TGCGCAGCTC CACCGCGGTC AGGGAGTCGA AGCCGAGTTC
77951 CCGCAGCCCC TGCGTCGCGT TGACGGGCGT GGCCGCGTCG TAGCCGAGGA
78001 CGGCCGCGAT ATGGGTG C AC ACCAGGTCGA GCAGCGCCTC CTCCCGCTCG 

78 051 GGGTCGGACA TCGCGCCGAG CGACTTGAGC AGCGCGGCCG CCCCCGCCGA 
78101 CACGGCACCG CCGCCGCTCT TGCTCCCCCC GCGCACCAGG TCGCGCAGCA 
78151 GCGCCGGTGC GGGGTGGCTC TGGGCCTGCC GGCGCATCCG GGCCAGGTCC 
7 8201 AG ACGG AC CG GCGCGTACAG GGGCAGTCCG CCGGCCCACG CCGCGTCGAG 
78251 GAGGGCGAGT CCCTCGTCGG CGCCGAGCCC GACCACGCCG GCGCGGGCAT
78301 GGCGCGCCCG GTCGGCGTCG GTGAGCCGTC CCGACATGCC GCTCGCCAGC 
7 8351 TCCCAGTAGC CCCACGCCAG GGAGGTCGCC GCCGCACCGC CGTCGTGCCG 
7 8401 GTGCCGGGCC AGCGCGTCCA AGAAGGCGTT GGCGGCCGTG TAGCTGCCCT 
784 51 GGCCGGGGCC GCCGAGCAGC CCGGCGACCG AGGAGTACAG GACGAACGCG 
78501 GACAGGTCCG CGTCCCGCGT CAGCTCGTGC AGGTGCCACG CGGCGTCCGC 
7 8551 CTTCACGCGC ATCACCTCCT CGACCTGCTC GGCCGTGAGG TTCTGCACCA 
7 8601 CGGCGTCGTT CACGGTGCCC GCGCAGTGGA AGACGGCGGT CAGCGGGTGG 
78651 TCCGAGGGCA CCGCCGCGAG GAGGGCGGCG GCTTCGTCCC GGTCGCCCGG 
7 8701 GTCGCACGCG GCGAAGGTGA CTCGCGCGCC GAGCGCGGAG AGGTCGGCGG 
7 8751 CC AGTTC GAG TGCGCCCGGC GCGTCGGCTC CCCGCCTGCT GGACAGCAAC 
7 8 801 AGGTGCCTGG CTCCGTACCG TTCCACCAGG TGACGGGCCG TCAGCGAGCC 
78 8 51 GAGTGCTCCG GTGCCGCCGG TGACCAGCAC GGTGCCCTCG GGGTCGAAGG 
7 8 901 CGGGAGGCAG CGAGAAGACG GTCGTGCCCG CCGAGGGCGG GGCCGCCATC 
7 8 951 GCGGCGGGCG CCTGCCGGAT GTCCCACACG GTGATGTCGA GCGGCGTCAG 
7 9001 AGCACGGCTG TCACCCCGTT CCGCGGGCAG CCCGGGCTCC GCCGACTCCG 
7 9051 TGATCTCGGC AAG CTCGGTC AGCTCGGTCA GCTCCGCGAG GATTTCCCGT 

7 9101 ACGCGCCCGG GCTCGGGCGG CACGACAGCC TGTCCCTCGT CCGGACGACC 

79151 GCCCGCACGG TGGACCACCA GGGCCCCCTC GTGGCGGAGG GTGACGTCGG 

79201 CCGCGCGCTC GGCGCCCGAA TCATCCGCCG TCGACGCACC GTCCACGGCC
79251 TCGACCCGCC ACCGCCCGGC AAGAGCAAGA CGCAGCACGG CACGGCCGAC

79301 GGAACCGGTT TCCTCCCCGA CGAGCAGAGT CTCGCCGCCG CGCGGCGCCA 

793 51 CGACATCCGC CAGCACGTGA TACGCGGACA CATAGGCCCC CAAGGACCCG 

7 9401 GCCGCCTGCG CCCAACTCCA GCCCGCCGGA ACCGGCATAA GCAGCGCGGC 

7 9451 ATCGGTGACG GCCACCGGGC CCACCGCGTC GAACAACCCC ATCACCCGGT 

7 9501 CGCCCACGGC CACCGAACCG ACCTCGCCGC CGACTTCCGT CACCACACCG 

79551 GCACCCTCGA CCTGGCCCGC CGTGAGCGGC CCCGGCGCCG CGGCCCGCAC 

7 9601 CGCCACCCGC AC CTCGTG CG GCTCCAGCGC CCGTCCGGCC TCGGGAGCGT 

7 9651 CGACAAGGGA CAACTGCTGT CCGCCGCCCG CCTCTTGGCA CCGAGCCAGC 

7 9701 CGCCACGTGA GCGATCCGAC CGGCGGCACC AGCCGCACCG ACGCGTCGTC 

7 9751 GCGCACGAGC CGTGGCACGT AGGCGCGCCC GTCACGCAGC GCCAATTCCG 

7 9801 GTTCGCCGGA GGCCAGTACG CCGGTCAGCG TGGCCGGAGA AGACTCCAGT 

7 98 51 CCGTCCACGT CGAGCAGCGT GAGGCGACCG GGATTCTCGG CCTGCGCGCT 

7 9901 GCGCACCAGA CCCCACAGCG ACGCGCCCGC CAGATCACCG GCGGTCTCAC 

7 9 951 CCGGCCGCGC GGCGACCGCG CCTCGGGTGA CGACGACGAG ACGGGTCGCC 

8 0001 GCGAACGCCG GGTCGTCCAC CCACTCCTTG AGCAGCGACA GAAGGGACAC 
8 0051 GGTGGCCAGC CGCGCGTACC CGGCCGGGTC GCCGCCCCTG CCATCGGCAT 
80101 CCGCAACGGC CCCGGCACCT GCGC CGGGCG CGGCGCACAC GGCGAGCACG 
8 0151 ACATCGGGCG GTTCGCCCCC AGCCGCCACT CCGTCCCGGA G CG C AC CG AA 
80201 CGTGTCCCAC ACGGGGCCGG CGGCCAGCGC ATCGGACAAG GCGTCGGCCA 
80251 GCGCACCGGC CGACGTACCG CCCATCGGGC CACTCTCGAC CGGCGCGAGG 
8 0301 ACCGCGGCAC GCGGGGCGCC GCCGCCCGTC TCCTCGGCCC GCGCGGCGAC 
8 0351 CTCCATCCAC ACGAGCCGGA ACAGCGCGTC ACGGTCCGCC GCACGGGCGC 
8 04 01 CCGCGATCTG GTGGGCGGCC ACCGGCCGTA CCG TGAGCG A CTCCAGCGTG 
8 0451 AGAACCGGCT CCCCGCCTCC GCCCCOGTCC ACGGCCGTGA GGGCCAGCTG 
8 0 501 GTCGGGCGCG GTGCG TGCGA TACGTACCCG CAACTTCTCA GCGCCCGGCG 




8 0 551 CGTGCACCCG CAACCCGCTC CAGGAGAACG GCAGCAGCAC TTGGTCCG TG 

80601 TCGGCGGACG ACGTGACCGC GTCCAGGATC AGCGCGTGCA GCGTGGCGTC 

80651 GAGCAACACC GGGTGCACCT GGTAGCGGTC GGCCCTGCCG CTCTCCGCCT 

80701 CGGGCAGCGC CACCTCGGCG AAAAGGTCGT CCCCGAGCCG CCACGCGCTC 

80751 ACCAGTCCCT GTGAGCCGGG CCCGAAGTCA TAGCCGTACG AAGCGAGTTC 

80801 CCCGTACGGA TCCTGCTCGC CG AC CGGTGT GGCGCCCGGG GGCGGCCACG 

80851 TCCCGCCGAA CGAGGCGTCC CCGGCGTCGG GCCCCGGGGG AGCGACCACG 

80901 CCCGCGGCAT GCCGGGTCCA CACGGCCTCC TCGCCCTCAC CCGTGGGCCG 

80951 CGAATGGACG GTCACGGGAC GCCGCCCGTC CTCGGCCACG G AAC CG AC C A 

81001 CCACCTGCAC GTCGACCGCG CCCGCACCCT CGTCCCCGAA GGCGAGCGGA 

81051 GTGTGCAGCG TCAGCTCCGC CAACTCCGCG CAGCCGGCCC GCACCGCGGC 

81101 CTGCAGCGCG AGCTCCACGA ACGCCGAACC GGGCAGCAGC AC CG TG TC CA 

81151 TGACCCGGTG CTCGGCCAGC CACGCCTGGT CCCGCGGAGA GATCCGGCCG 

812 01 GTCAGCAGGT GACTGCCGCC GTCCGCGAGT TCCACGGCGG CTCCGAGCAG 

812 51 CGGATGCCCC GCGGACGCGA GCCCGAGCCC CGCCGGGTCC CCGGCGAGCC 

813 01 CCCTGCGCCC CTCCAGCCAG AACCGCTCCC GCTGGAAGGC GTACGTCGGC 

813 51 AGATCCACCA CCCGAGGCAG CGGCACGGCC GGGAACCAGC CCGTCCAGTC 

81401 GACCTCCGCC CCCGCGCCGA AGGCCTGGGC GGCCGCGCGG GTGAGCTGCG
81451 CGGCGTCGCC GTGGTCGCGG CGCAGGGTGG GCACGACGGT GGCGGGCATG
81501 TCGGCCCGCT CGATGGTCTC CTCCATGCCG AGGTTGAGGA CGGGGTGGGG 
81551 GCTGGCCTCG ATGAACAGGC GGTAGCCGTC GGCCAGCAGC GCTTCGATGG 
81601 TGTCGGCGAA GCGGACGGGC TGGCGGAGGT TGGTGACCCA GTAATC CGTG 
81651 TCGAGGGTGG TGGTGTCGTC G AGG CGVTCG GCGGTGACCG TGGAGTAGAA 
81701 GGCGACGTCC GTGGTCGTGG GCCGGATGTC GGCCAGGCGC TCGG TGAGG A 
817 51 GGTCGTGGAG CTGG TCG ATC TGGGGGCCGT GGGAGGCGTA TCCGACGTCG 
81801 ATGACGCGGG CGCGCAGGCC TCGCGCCTCC GCATCCGCGA CCACGGCTGC
81851 CACATGCTCC GGCGGCCCTG AAATGACCGT AGAGGAGGGC CCGTTGACGG
81901 CAGCGACACA CACGCCGGGC CGGTCGCCGA TGAGCTCAGC AACCTGCTCC 
81951 GAGCCGGCCC CCAACGACGC CATGTCGCCC TGCCCCATGA GCTGACGGAG 
82001 CGCGTCACTG CGTACGGCCA CGATCCGCGC CGCATCCTCC AGTGACAGTG 
82 051 CCCCCGCCAC ACACGCGGCG GCCATCTCGC CCTGCGAGTG CCCGATGACG 

82101 GCAGCCGGGG TGATGCCGTA ATCGGCCCAC ACCGAAGCCA GCG AG AC CAT 

82151 CACCGCCCAC AACACGGGCT GCACGACCTC GACCCGGGAC AGCTCACTCC 

82201 CGTCCCCGCG CAACACCGCA CTCAGCGACC AGTCCACATG CGCCG AC AGG 

8 2251 GCCCGCTCAC ACTCCGCGAT CCGCGCCGCG AAGACGGGGG ACTCGTCAAG 

82 301 GAGCTGGGCA CCCATGCCCA CCCACTGCGA CCCCTGCCCC GGAAACACCA 

823 51 ACACCGGACC CGCGCCGGAG GCGCCCTGTA CGGCGCCCTC GACGACGTCC 
82401 GGTGACGGCT CGCCCGCCGC CAGGGACCGT AG CCCGGCGA GGAGAGTCTG 

824 51 GCGGTCCTTG CCCACGACGA CGGCTCGGTT CTCGAACACC GACCGGGTCT 
82 501 TGACCAGGGA CCAGCCCACG TCCAGCGGCG ACGCGAGCCG CGGGTCGGCG 
82 551 GTGGCGCGGT CGGCCAGCAG GCGGGCCTGG GCCCGCAGCG CCTCCTCGCC 
82601 GCGCGCCGAC ACCACCCAGG GCACCACTCC GGCCGGCGCC GCGGCGTCCT 
82 651 CCGCCGGAGC GGTCACGGGC TCCGGCGCGT CCGGGGCCTG TTCCAGGATG 
82 701 AGGTGCGCGT TGGTGCCGGA GATGCCGAAG GCGGACACCC CGGCGCGGCG 
82 751 CGGGCGTTCG CCGCGCGGCC AGGAGACCGG TTCGGACAGC AGGCGGACGC 
82 801 CACTGCCGTC CCAGTCCACG TGCGGCGTGG GCGCGTCGAT GTGCAGGGAG 
8 2 851 GCGGGCAGCT GTTCGTTGCG CAG CGCCATG ACCATCTTGA TCACACCGGC 
82 901 GACACCGGCC GAGGCCTGCG CGTGCCCGAT GTTCGACTTG AT CG AG CCG A 
82 951 GCCACAGCGG CCGGTCCGCG GGCCGCTCCT TGCCGTAGGT GGCGACGAGC 
83001 GCGCTGGCTT CGATGGGGTC GCCCAGCATG GTGCCGGTGC CGTGCGCCTC
83051 CACCGCGTCG ACGTCCTCGG CGGAGAGCCG CGCGTTGGCG AGTGCCTGCC
83101 GGATCACCCG CTGCTGCGCC TGCCCGTTGG GTGCCGTGAG CCCGTTGCTC
83151 GTGCCGTCCT GGTTGATGGC CGAACCCCGG ATCACCGCCA GGACGTTGTG

83 201 GCCGTTGCGC CGGGCCTCCG AGAGCCGTTC GAGTACGACC AGGCCGACTC 

83 251 CCTCGGCCCA GCCGGTGCCG TCGGCGGCGG CCGCGAACGG CTTGCACCGG 

83301 CCGTCCTTGG CGAGGCCGCG CTGCAGCGAG AACTCGACGA ACGAGCCCGG

83 351 CGTGGCCATC ACCGTCGCGC CGCCCGCGAG AGCGAGCGAG CACTCGCCCT 

8 34 01 GGCGCAGCGC GTGCGCCGCC TGGTGGATCG CCACCAGGGA CGAAGAGCAG 

83451 CCGGTGTCGA TCGTCATGGC GGGGCCTTCT AGGCCGAGTA CGTACGACAC 

83 501 CCTGCCGGAG GCGACACAGC CGAGGTTGCC GGTGCCGATG TAGCCCTCGA 

8 3 551 CCTCGGTGGG CTGTTCACCG ACGAGCGCGA GGTAGTCGAA GATGGTCAGG 

8 3601 CCCGTGAACA CCCCGGCGTC GCTGCCCTTG AGGGTCTCCC GGTGGAGGCC 

83651 CGCGCGTTCG ATCGCCTCCC ACGCGGTCTC CAGGAGCAGC CGCTGCTG CG 

83 701 GGTCCATCGC GACGGCCTCG CGGGGGCTGA TGCCGAAGAA TCCGGCGTCG 

8 3751 AAGTCGCCCG CGTCGTAGAG GAACCCGCCT TCGCGCACAT AGCTGGTGCC 

83 801 GCGGCTCTCC GGGTCCGGGT CGTACAGCGT CTCCAGGTCC CAGCCCCGGT 

83 851 CGTCGGGGAA GGCCCCCATG GCGTCCTTGC CGGCCGCGAC CAGATCCCAC 

83 901 AG CTCCTCGG CGGAGCGGAC GTCGCCCGGA TAGCGGCAGG CCATGCCGAC 
8 3 951 GATCGCGATC GGCTCGTCGT CGGCGGCGCC CCTGGAGGCC CCGGCCGCCC 

84 001 GCACCGGGTC GGCGGAGGCC GCCGCGTCAC CGGACAGCTC GGCCCGCAGG 
84 051 ACGTCGGTGA GCGCGTCGGG GGTGGGGTGG TCGAAGACGA CCGTGGTCGG 
84101 CAGTGTCAGG CCGGTGCTCT TGTTCAGCCT GTTGCGCAGC TCCACCGCGG 
84151 TCAGCGAGTC GAAGCCCAGC TCCTGG AACG GCTTGGTGGC GGG CACCGCG 
84 201 TCGACGTCCG AGTGCCCCAG CGTGGCCGCC GCCTGGGAGC GCACGTGCTG 
84 251 CAGCAGCAAC TGCCGCTGCT GCGCCGGCTT CGCCTCCGTC AGCTCCTGCT 
84301 GGAGCGACGA TGCCTCCGTG GCGTCTTCCT GCTGTGCCGC GGGTGCGCTG 

843 51 GCCCG CGGGT TCTCGGGCAG ATCGGCGAGG AGCGGGCTGG GCCGCTGCGC 

84401 GGTGAACGTC GACGTGAACT GCGCCCAGTC GAAGTTCGCC ACGGTCAGCG
84451 TCGTCTCACC CGCGTCCAGG GCCTGCTGCA GCGCCTTGAC GCACAGCTCC

84 501 GGGCTGAGCG GGTGCAGGCC GAAGCGGCTG AAGAACGTCA ACGCGGCCTG 

84 551 GTCCGCCGCC ATGCCCGCCT CGGCCCAGGG CCCCCAGGCG ATGGAGGTGG 

84 601 CGGGCAGGCC CTCGGCGCGG CGGTGCTCGG CGAGGGCGTC GAGGAAGTGG 

84 651 TTGGCCGCAC CATAGGCGCC CTGCTGGCCA CTGCCCCACA CGCCTGCGCC 

84 701 CGACGAGAAC ATCACGAACG CCGAGAGCGG CAACTCCCGG GTCAGTTCAT 

84 751 G C AG ATGGTG AG CGGCG AG C GCCTTCGGAC GCAGCACCTC GTCCAGCTCG 

84 801 GCACCCGACA CGTCGCCGAG AC CG ATGTAG TTCGGCACGC CGGCCGCGTG 

84 851 GATGACGGCG GTCAGCGGGT GCTCGGCGGG GACATCGTCG ATGAGGCGTC 

84 901 GCACCTGCTC GCGGTCGCCG ACGTCGCAGG CGGTGACGGT GACGGCGGCC 

84 951 CCCAACTCCG TCAGTTCCGC GGCGAGTTCC TGTGCTCCCG GGGCGTCGGG 
8 5 001 GCCGCGGCGG CTGGTCAGGA GGAGGTGCGG GGCGCCCGCA CGGGCGAGCC 
85051 ACCGCGCGAG GACGGCGCCG ATGCCGCCGG TCCCGCCGGT GATGAGAGTG 
8 5101 GTGCCGTCGG GCCGCCAACC AAGCCCGCTG GCGACCGTGT TGGCGGGCGC 
8 5151 GTGTGCAAGG CG ACGGG CAT GGACGCCGGA CGG CCGGATG GAGATCTGGT 
85201 CCTCGTCCTG CGGAACCAGC GCGGCGGCCA GCCGGGCCAG CGTCTGATGG

85 251 TCGATACGAG CGGGCAGATC GACCAGCCCG CCCCACAGCC GCGGATACTC 
85 301 CAG CG CAG CG ACGCGCCCCA GCCCCCACAC CTGAGCCTGC AC CGGGTGGG 
85351 TGAGGGCGTC GCCGGCGCTC GTGGAAACAG CCCCCTGCGT GAG AG TGCGT 
854 01 ACGGCGATGT CGGCGCCGTT GTCCGCGAGG GCCTGGACGA GAGCGGTCGT 
854 51 CGCGGCGAGT CCGGCGGG C A CGGCCGAGTG CTCGGGATGC GGCTCCTCGT 
85501 CCAGGGCCAG CAGATTGACG ACTCCGGCAA ACGCGGCCCC GTCCATCAGG 
85551 ACACGCAGCT CCTGCGCCAA CTCCGTACGC TCCATGG C AC GTGCGTCGAC 
85601 CACGTGGCGT CGCACCTCGC CACCATGGGC GGTCAGCGTC TGCGCGGTCG 
85651 CGAGGACGGC CGGGTGGTCG GCGTGCGCGG CGGGCACGAG CAGCAGCCAG 
85701 GCCCCGCTGA GCTCCGGCGC CGGCACGTCG GGCAGATGCT TCCAAGTGAC
85751 CTGATAACGC CAGGAGTCGA CGGTGGACTG CTCGCGGTGC CGACGCCGCC
8 5801 AGG C CG AG AG GACGGGCAGC GCGGACTCCA GCGCTCCGAC GCTCTCCGCC 
85851 TGCCCCTCGA TCTCCAGACT GCCGGCGAGG GCGTCGATGT CCAGGTCCTC 
85 901 GATCGCCTGC CACACCCGGG CCTCGACCGG ATCGTGCCCA CCACCCACGG 
8 5951 CTGCGACCGC CGCGGGCGGC TCCACCGAGT AGTGCTTGTG CTGGAAGGCG 
86001 TAGGTGGGGA GGTCGACGGT ACGGGGGGTG GGGTCGGCCG GGAACCAGCG 
86051 GCGCCAGTCG ACGGGGGCGC CGGCGGTGAA GGCGTGGGCG GCCGCGCGGG 
86101 TGAGCTGGGT GGTGTCACCG TGGTCGCGAC GCAGGGTGGG GATGGTGACG 
86151 GCCGTCCCCG CAGCACCGGC CTGCTGCTCG ATGGTCTCCT GGATGCCGAG 

8 62 01 GTTGAGGACG GGGTGGGGGC TGGCCTCGAT GAACAGGCGG TAGCCGTCGG 

8 6251 CCAGCAGCGC TTCGATGGTG TCGGCGAAGC GGACGGGCTG GCGGAGGTTG 

86301 GTGACCCAGT AGGCGGTGTC TAGGGCGGTG GTGTCGTCGA GGCGCTCTGC 

8 6351 GGTGACCGTC GAGTAGAACG CCACGTCGGT GGTGGTCGG C TGGATGTCGG 

86401 CGAGCCGGTC GGTGAGGAGG TCGTGGAGCT GGTCGATCTG GGGACCGTGG 

8 6451 GAGGCGTACC TGACGTCGAT GACGCGGGCC CTGAGTCCCT GCGCCTCCGC 

86501 ATCGG CGACG ACGGCTGCCA CATGCTCCGG CGGGCCCGAA ATCAC GGTCG 

86551 ACGACGGTCC GTTGACGGCC GCGACGACTA CGCCCGGCCG GTCGCCGATC 

86601 AGCTCTGCGG CCTGCTCGGC ACCGGTGCTG AG CG AGG CCA TGTCGCCGTG 

86651 CCCTTGCAGC TGACGAAGCG CGTCGCTGCG TACGGCTACG ATCCGTGCCG 

8 6701 CATCCTCCAG TGACAGTGCC CCCGCCACAC ACGCGGCAGC CATCTCGCCC 

86 751 TGCGAGTGCC CGATGACGGC AGCCGGGGTG ATGCCGTAAT CGGCCCACAC 

8 6801 CGCAGCCAGC GAGACCATCA CCGCCCACAG CACGGGCTGC ACGACCTCGA 

86851 CCCGGGACAG CTCGCTCCCG TCCCCGCGCA AGACATCACT CAGCGACCAG 

86901 TCCACATGCG CCGACAGCGC CTGCTCACAC TCCGCGATCC GCGCCGCGAA 

86951 GACGGGCGAC TCGTCAAGGA GCTGGGCGCC CATGCCCACC CACTGCGACC 

87001 CCTGCCCCGG AAACACCAAC ACCGGCCCAG GACCCACATC ACCGGCCACC
87051 CCGGCCACCA CATCCGCCGA CGCCTCACCC GCGGCCAATG CCTCCAGGCT
8 7101 GGCACCAGCC TGAGCCAAGT CCCGCCCCAC GACCACGGCC CGCTGATCGA 
8 7151 ACAACGCGCG TGTCGTGGCC AGCGACCAGC CCACCTGGGA GACCGACGCA 

872 01 TCCGCCAGCC CGGCCGCGAA CTCGCCCAGC CGCCGCGCCT GTTCACGCAA 
8 7251 CGCGTCCGGC GTCCGCCCGG ACACCACCCA CGGCACGACC CCACCCGGCT 

87301 CAGCCGCCAC GGGGCCCGGC GCGTCCTCTT GCGGCGGCGC CTCCTCCAGA
87351 ATCAGGTGCG CGTTCGTCCC GGAGATC CCG AACGCCGAGA TGCCTGCCCG 
87401 CCGCGTGCGC TCCGCCGGCC AGTCCACGGG CTCGGAGAGC AGTCGTACGC 

874 51 TGCCCTGTTC CCACTGGACG TGCGGTGACG GGGCGTCGAT GTGCAGGGAG 
87501 GTCGGGAGGA GACCGTTGCG CATCGCCATG ACCATCTTGA TGACGCCCGC 
87551 GACACCGGCG GCGGCCTGCG TGTGGC CG AT GTTGGATTTC ACCGAGCCGA 
87601 GCCAGAGCGG ACGGTCCTCC GGGCGCCCCT GGCCGTACGT GGCGATCAGG 
8 7651 GCCTGCGCCT CGATGGGGTC GCCGAGCGTG GTGCCGGTGC CGTGCGCCTC 
87 701 TACGGCGTCG ATGTCCTCGG CGGAGAGGCG GGCGTTGGCG ACGGCCGCGC 
87751 GGATGACGCG TTCCTGGGAG GGGCCGTTGG GGGCGGCGAG CCCGTTGCTC 
878 01 GTACCGTCCT GGTTGGTGGC CGAACCCCGT ATCACCGCAA GGACCTTGTG 
8 7851 GCCGCGGCGC CGCGCTTCGG AG AGC AG CTC CAGCGCCACC ACCCCGGCGC 
8 7 901 CCTCGCCCCA GCCGGTGCCG TCGGCGGCGG CCGCGAACGG CTTGCACCGC 
87 951 CCGTCGGGCG CGAGCCCCCG CTGCCGGGAG AACTCGGTGA ACGAACCCGG 
8 8001 CGTCGCCATC ACCGTCGAAC CGCCCGCCAG CGCGAGCGAG CACTCGCCCT 
88051 GCCGCAGCGC CTGACTTGCC AGATGGATCG CCACCAGGGA CGACGAGCAC
88101 GCCGTGTCGA CGGTGACCGC GGGACCTTCG AGCCCCACCG TGTAGGAGAT
88151 CCGGCCCGAC ACCACACTGC CGAGG TTG CC GGTGCCGATG TACCCCTCGA 
8 82 01 CGTCGCTGGC CGTCTGGCTG ATCAGCGTCA GGTAGTCGTG GGCGCTCACT 
8 8251 CCGGTG AAG A CGCCGGTGTC GCTGCCCTTC AGCGCGTGCG GGTTCATGCC 
88301 CGCGTGCTCG ATCGCCTCCC ACGCGGTCTC CAGGAGCAGC CGCTGCTGCG
88351 GATCCATCGC CGTGGCCTCG CGCGGGCTGA TGCCGAAGAA CTCGGCGTCG

8 8401 AAATGGCCGG CGTCGTACAG GAAGGCGCCG TCCCGCACAT AGCTGGTGGC 

88451 CGGATGCTCC GGATCCGGGT GATACAGCGA CTCCAGGTCC CAGCCCCGGT 

88501 CGTCGGGGAA CCCCGCGACC GCGTCACCCC CGTCGCGCAC GAGTTCCCAG 

8 8551 AGGTCCTCCG CCGACCGGGC GCCGCCCGGG TAGCGGCAGG CCATGCCGAC 

88601 GATGGCGACC GGCTCGGTCG ATTCCTTGTC GTGGAGCCGT TGCCGGGCCT

8 8651 GGCGCAGCTC CGCGGTGACC CACTTGAGGT GATCGAGAAG CTTCTCCTCG 

887 01 TTCGACATCT GACCCAGGCT CCTTGGCGCT ACGTGGTGAT CGGGGCGTAT 

88751 GAGGTTGGGG GAGGGCAAGG GGG CCGGTGT GGCCGGGGCT CATCGCGCTC 

88 801 AGGACTGATC GCTGCTCAGG ACTTCCCGAA CTCACTGGAG ATGAGGTCGA 

88 851 AGATGTCGTC CGCGCTCGCC GCCTCCAGAT CGGCATGGGC CGAATCAGTG 

88 901 CCTTCCGGCC CGTCCTGCGC CGGACTCCAC TTCGACACAA GGACCTGCAG 

8 8 951 CCGGCCCACG ATGCGGCGCC GGGCCGCCTC GTCCACCTCG GCCGCTCCGA 

89001 ACGCCGTGTC CCACTTGTCG AGCGCCGCGA GCACGTCGCC CTCACCTGCG 

89051 ACCTCGGCGC CGTCGCCGAG CTGTCCGCGC AAGTG CGTGG CGAGGGCCTC 

8 9101 GGGCGTGGGA TGGTCGAAGA TCACGGTGGC GGGGAGCGAG AGTCCGGTCG 

8 9151 TGGTGTTGAG CTGGTTGCGC AG CTGG AC CG CGGTGAGCGA GTCGAAGCCC 

8 9201 AG CTCCTGGA ACGGCTTCGC GGCGGGAATG TCCTCCACCG TGCGGCCGAG 

8 9251 CGTCGCGGCC GCGTATGTCC GGACCTGCTG GACCAGGAAG CCGAGCCGCT 

8 9301 GTGATGCGGG CGTCTTCGCC AGCTCCTCGC GGAACGCGCT CGTCTCG G CG 

8 9351 GCGGTCCCCG TCTGCTCGGC CTCCCGCTGG TTCTCCGGAA GG TCGTCG AG 

8 9401 GAACGGACTG GGCCGCTGCG CGGTGAACGT CGGCGTGAAC TTCGCCCAGT 

8 9451 CGAAGTTCGC CACGGTCAGC GTGGCGTCGC CCGCGTCGAC CGCCTGGTGC 

8 9501 AGCGCCTTGA CGCACAGATC CGGAGCGATC GGGAGCAGAC CGAAGCGCTT 

8 9551 GAAGTACGTC AGTGACTCCG GGTCGGCGGA CATGCCCGCC TCGGCCCAGG 

89601 GCCCCCAGGC GATGGAGGTG GCGGGCAGGC CCTGGGCGCG GCGGTGCTCG
89651 GCGAGGGCGT CGAGGAAGTG GTTGGCCGCA CCATAGGCCC CCTGCTGGCC
8 9701 ACTGCCCCAC ACGCCGGCGC CCGACGAGAA CATCACGAAC GCCGAGAGGT 

8 9751 CCAGGTCGCG CGTCAACTCG TGCAGGTTCC AGGCCGCGTC GGACTTCGAC 

8 9801 CCCAGCACCT CGCCGAGGCG CGCGGTCGTC AG AT C AC CG A TCGCGGTCAG 

8 9851 ATCGGTCATG CCCGCCGCGT GGATGACGGC TGTGAGGGGA TGCTCGGCCG 

8 9901 GCATGTCGTC GATGAGGCCG CTCAGTTGGC GGGGATCGCT GACGTCGCAG 

89951 GCGGTGATGG TGACGGCGGT GCCGAGCCCG TCGAGCTCGG CGGCGAGTTC
90001 CCGGGCGCCG GGCGCGTCGG GCCCGCGACG GCTGGTGAGG TGAAGACGGG 
90051 GGGCGCCCTG CCGGGCCAGC CAACGGGCGA GGACGGCACC GATGCCGCCG 
90101 GTCCCGCCGG TGATCAGGGT GGTGCCCCGA GGCCGCCAGG TGGCCTCGCT 
90151 GTGCACGGGA TTCTGAATGC TTCCGACGGC GTGCGTGAGG CGCCGGTGGT 
90201 GGATTCCGGT GGGGCGGACG GCGGTCTGGT CCTCGTCGTC CTGGGGGAGG 
90251 AGAGCGGCGG CGAGG CGGGG GAGGGTGTGG CGGTCGATAC GAGCGGGGAG 

9 03 01 GTCGACGAGT CCGGCCCAGA GGCGCGGGTG TTCGAGGGCT GCGACGCGGC 
90351 CGAGCCCCCA GACGTGAGCC TGGAGGGGGT GGGTGAGTGG GTCGGTGGCG 
90401 GCCGTGGACA CGGCACCCTG CGTGACGGTG TGCAGGGGTG CGGTCGTGCC 
90451 GTTGTCGCCG AGGGCCTGGA GGAGAGCGGT CGTCGCGGCG AGCCCCGCGG 
90501 GCACGGCGGG GTGCTCGGGG TGCGGCTCCT CGTCCAGCGC CAGCAGATTG 
905 51 ACGATTCCGG CAAGACCGGC CGTGTCCACC GCGGCCAGCT CCTGACGTCC 
90601 CGCCCGGCCG GTCTCGACCG GATGCAGCCG GACGGCGGCC GCCCCGTGCT 
90 651 CGCTCAACGC CTCGGCGGTG GCTCGTACGG CGGGGTGCTC CGCCTTGTCG 
90701 GCAGGGACGA ACAGCAGCCA GTCGCCGCCG AGTTCCGGTG CGGGCCCGTC 
90751 GGACCGCTGT TTCCACGTGA CGCGGTACCG CCAGGAGTCG ATGGTCGCCT 
90801 GGTCCTGGTG CCGACGCCGC CAGCCCTTGA GCACCGGCAA CGCGGGCTCC 
90851 AGCGCC CGG A CCGCCTCCTC GCTGCCCTCC TCCGACCCCA GCGTCTCGGC 
90901 CAGCAGACCG AGATCGAGCT CCTCGACGGC GTGCCACAGC TGGGCCTCGG
90951 CCGCACTCTG CTCACCGCTG ACGGCGCCCG AGGCGGACGC GGAACGTTCG

910 01 AGCCAGTAGT GCTGGTGTTG G AAGG C G TAG GTGGGGAGGT CGACGGTGCG 
910 51 GGGGGTGGGG TCGGCCGGGA ACCAGCGCCG CCAGTCGACG GGGGCGCCGG 
91101 CGGTGAAGGC GTGGGCGGCG GCACGGGTGA GCTGGGTGGT GTCGCCGTGG 
91151 TCGCGGCGGA GGGTGGGGAC GACGGTGGCG GGGATGTCCG CCTGCTCGAT 
912 01 GGTCTCCTCC ATGCCCAGGC CCAGCACGGG GTGGGCGCTG GCCTCGAffeA 

912 51 ACAGGCGGTA GCCGTCCGCG AGAAGGGCTT CGATGGTGTC GGCGAACCGG 

913 01 ACCGGCTGGC GGAGGTTGGT CACCCAGTAA TCCGTATCCA GGGCTGTGGT 
91351 GTCCGTCAGA CGCTCGGCGG TGACCGTCGA ATAG AAGG C C ACGTCCGTGT 

914 01 TCGCGGGCCG GATGTCAGCC AGGCGTTCGG TCAGCAGATC GTGGAGCTGG 

914 51 TCGATCTGGG GGCCATGCGA GGCGTACCCG ACGTCGATGA CACGGGCGCG 

915 01 C AG AC CACGT GCCTCCGCAT CGGCGACCAC GGCAGCCACA TGCTCCGGCG 
91551 GCCCTGAAAT CACCGTAGAC GACGGCCCAT TGACCGCCGC GACGACCACG 
91601 CCCGGCCGGT CACCGATCAG CTCAGCGGCC TG CTCGGC AC CGGTGCTCAG 
91651 CGAGGCCATG TCACCGTGCC C TTGC AG CCG ACGAAGCGCG TCACTGCGTA 
917 01 CGGCTACGAT GCGCGCCGCA TCCTCCAGCG ACAGCGCCCC CGCGACGCAC 

917 51 GCGGCAGCCA TCTCACCCTG CGAGTGCCCG ATCACAGCAG CCGGAGTGAC 
91801 CCCGTAATCA GCCCACACCG CAGCCAGCGA GACCATCACC GCCCACAACA 

918 51 CCGGCTGCAC GACCTCGACC CGGGACAGCT CACTCCCATC CCCGCGCAAC 
91901 ACCGCACTCA GCGACCAGTC CACATACGCC GACAGCGCCC GCTCACACTC 
91951 CGCAATCCGC GCCGCGAAGA CGGGGGACTC GTCCAGCAGC TGGGCACCCA 
920 01 TGCCCACCCA CTGCGACCCC TGCCCCGGAA ACACCAACAC CGGCCCAGGA 
920 51 CCCACATCAC CAGCAACCCC GGCCACCACA CCCGCCGAAG CCTCACCCGC 
92101 AGCCAACGCC CCCAGGCCAG CCGTCAACGC ATCGCGGTCA CGCCCCACCA 
92151 CCACAGCCCG G TGCTCG AAC ACCGACCGGG TCGTGGTCAA CGACCAGCCC 
92201 ACATCAGCCG CCGACGCATC CGCCGGCCCG GCCGCGAACT CGCCCAGCCG
92251 CCGCGCCTGT GCACGCAGCG CGTCCGGCGT CCGCCCGGAC ACCACCCACG

923 01 GAACGACCCC ACCCGGCTCC TCGGCCACGG AGCCCGGCAC GTCCTCCTCC 
92351 TCCGGTGGTG CCTCCTCCAG GATCAGATGC GCGTTCGTCC CCGAGAAGCC 

924 01 GAACGAGGAC ACCCCCGCCC GGCGCGGGCG CTCGCCCCGG GGCCACTTCA 
924 51 CCGGGTCGGT GAGCAGGCGC AGCCCGCTGC CGTCCCACTC CACGTGGGGC 
925 01 GAGGGGGCGT CGACGTGCAG GATGGCGGGC AGCAGGTCGT GCCGCAGGOC 
92 551 CAGGACCATC TTGATGACAC CGGCCACACC GGCGGCGATC TGCGTGTGGC 
92 6 01 CGATGTTGGA CTTCACCGCT CCCACCCACA GCGGCCGGTC CTCCGGCCGT 
92 6 51 TCCCGGCCGT AGGCGGAGAT GAGAGCCCCG GCCTCGATGG GGTCGCCGAG 
92 7 01 CGTGGTGCCG GTGCCGTGCG CCTCCACGGC GTCGATGTCC TCGGGGGCGA 
92 751 GGCGGGCGTT GGCGAGGGCG GCGCGGATGA CGCGTTCCTG GGCGGGGCCG 
92 801 TTGGGGGCGG TCAGGCCATT GCTCGCGCCG TCCTGGTTGA TCGCCGAACC 
92 851 CCGGATCACC GCGAGGACCT TGTGGCCCTT CCTGCGGGCG TCGGAGAGAC 
92 901 GCTCAAGGAG AACCACCCCC GTACCCTCCG CCATGCCCAT GCCGTCGCTG 

92 951 CTCGCCGAGA ACGGCTTGCA CCGTCCGTCG GGGG CCAGGC CGCGCAGTTC 

93 001 GCTGAAGCCG ATCAGCGGGG CGGGCGACGA CATCACGTAC GTGCCGCCCG 
93 051 CCAGCGCCAG CGAGCACTCC TGTGTGCGCA GGGCCTGGGT GGCGAGGTGA 
93101 AGGGAGACCA GCGACGAGGA GCACGCCGTG TCGACCGTCA CCGCGGGGCC 
93151 TTCGAGGCCC AGGGTGTAGG CGACGCGGCC GGAGGTGACG CTGCCGGAGT 
93 2 01 TGCCGATGGT GAAGTATCCG GCGGTGCCCT CGGGGACCTC GGACGCGCCG 

932 51 AGGGCGTAGT CGAGTCCGTC AC AG CCG ATG AAGGTGCTGG TGTCGCTGGA 

933 01 GCGGAGGCTG AGGGGG TCG A TGCCGGCCCG TTCGATCGCC TCCCACGCCG 
93351 TCTCCAGGGC GAGCCGCTGC TGCGGCGCCA TGGCCGCGGC CTCGGTGGGT 

934 01 CCGATGCCGA AGAAGGTGGG GTCGAAGTCA CCGGCGTCGT AGACGAAGCC 
934 51 GCCTTCCCGG ACGTAACTGG TGCCGGTGCT C TCGGGG TCC GGGTCGTAGA 
93 501 GGGAATCGAG GTCCCAGTTG CGGTTGCCGG GCAGGGGCGC GACGGCGTCG 


93551 CCGCCGGTGG AG AC CAGCTC CCAGAACTCT TCGGGAGACC GGACTCCGCC 
93601 GGGC AG CCGG CAGGCCATGC CGATGACCGC G ACCGG TTCG TGGCCGGCCG 
93651 ACTCGACGTC CTGCAGCCGG CGTTCCGTCT GACGCAGGTC CGCGGTGACA 
93701 CGCTTGAGGT ATTCCAGAAG TTTCTCTTCG GTGTGCGCCA TCCCGGTG AC 
93751 AACCGCCCCT CTCCGCGAGA ACAGACCGCA GACTCGTCGA CGG CGCTAAA 
93 801 GCCCTCCTAA TACTCGGCTG TGTACCGCTC GCTGCCACGG GTGTCCGCAC 
93 851 TGGTCGGAGG CTCCGGCCCA GGGAACAGGG GCTTTCTTAG GGGCGCTTAA 
93 901 GCGGTGCCTG CCAGGGTGTG CCGGTGTCAG GCCGTCACGC CCTGATCAGC 

93 951 GGCGTCGCCC GTGCCGTGCC CGTGCGGTCG GTGGGCCTGA CCGTCGGTCC 

94 001 GGACAACGCG AAGCGAGGCA TCGTGC CC AT CACGGATAGC AAGCCGGCCG 
94 051 CCACATTCCC CGACCTGGTC GACCCGTCGT TCTGGGCGCG GCCGCACGCG 
94101 GAACG CGTGG CGCTGTTCGA GGAGATGCGC GGGCTGCCGC GGCCGGCGTT 
94151 CATCCGGCAG AACATGCCCG GCGTGCCCTG GACGTTCGGC TACCACGCGC 
94 2 01 TGGTCAAGTA CGCGGACATC GTGGAGGTGA GCCGCCGCCC GCAGGACTTC 
94251 TCCTCGAACG GCGCGACCAC CATCATCGGT CTGCCGCCCG AGCTGGACGA 
94301 GTACTACGGC TCGATGATCA ACATGGACAA CCCGGAACAC TCGCGGCTGC 
94351 GGCG CATCGT CTCGCGTTCC TTCGGCCGCA ACATGATCCC CG AG TTCG AG 
94 4 01 GCCGTGGCGA CCCGCACCGC CCGCCGCATC ATCGACGAGC TCATCGCGCG 
94451 GGGACCCGGC GACTTCATCA GGCCCGTCGC CGCGGAGATG CCCATCGCCG 
94501 TGCTCAGCGA CATGATGGGC ATCCCGGCGG AGGACCACGA CTTCCTCTTC 
94551 GACCGGTCCA ACACGATCGT CGGCCCCCTC GACCCGGACT ACGTGCCGGA 
94601 CCGGGCGGAC TCCGAACGCG CGGTGATC G A GGCGTCACGC GAACTCGGCG 
94 651 ACTACATCGC TGGCCTTCGT GCGGAACGGC TCGCCGCCCC CGGCAACGAC 
947 01 CTCATCACCA AGCTCGTGCA AGTCCAGGCG GACGGCGAGC AGTTGACGCG 
94751 GCAGGAACTC GTCTCCTTCT TCATCCTGCT CGTCATCGCC GGGATGGAGA 
94801 CCACCCGCAA CGCCATCTCG CACGCGCTGG TACTGCTGAC CGAGCATCCC
94851 GAGCAGAAGC AGCTGCTGCT CTCGGACTTC GACACGCACG CGCCGAACGC

94 901 GGTCGAGGAG ATCCTCAGGG TCTCCACGCC CATCAACTGG ATGCGGCGCG 

94 951 TCGCCACCCG CGACTGCGAC ATGAACGGCC ACAGGTTCCG CAGGGGCGAC 
950 01 CGGATCTTCC TGTTCTACTG GTCGGGCAAC CGGGACGAAT CCGTCTTCCC 
95051 TGACCCGTAC CGGTTCG AC A TCACGCGCGG GACGAACGCG CACGTCACGT 
95101 TCGGCGGGGT GGGCCCGCAC GTCTGCCTCG GGGCCCACCT CGCCCGTATG
95151 GAGATCACCG TCCTGTACCG GGAGCTGCTC GCGGCGCTGC CCCAGATCCA 
952 01 TGCCGTGGGG CAGCCCCGCA GGCTGGACTC CAGCTTCATC GAAGGGATCA 

952 51 AGCACCTGCA CTGCGCCTTC TGAGCACATA CGCTTCCCTC TGCGCATGTG 

953 01 CGCTCACGAC GCTCCGATCA GCGACTGCCA ACGACTGTCA GCGACCGGAC 

953 51 AGGGCCAAGG GCGGTGGGGA CATCAGGTGC ATGTCACCCG CGAGTATGGC 

954 01 CCGCTGCAGC TCCTGGAGCG GGCGCCCGGG TTCGAGCCCC AGCTCGTCGT 

954 51 TGAGCGTCTT GCGCACCGAC TGGTACACCT TCAGCGCGTC CGCCTGCCGC 
95501 TCGGAGCGGT AGAGCGCCAG C ATCAG CTGG CGG TAG AACG CCTCGCACAT 

955 51 CGGGTTCTCC GCGGTGAGGG CGTACAGCAT GCCCACGGCC TCGCGGTGCC 
95601 GGCCGAGCTG GAG CTGG C AC TCGACGAGCA TCTCCTGACA CTCCAGGCGG 

95 651 ATCTCGGTCA GCCAGGTCGA GAAGCCGTCG ATG ATCGGG C CGTTGGTGCC 
95701 GGGACCGTTC CCGCCCTGCC CGAGGATCGG GCCGCGCCAC AGCGCGAGCG 
957 51 CCTGCCCGAA ACAGGAGGCC GCCTCGTCGA ACCGCTTCTC CCTGAGCAAC 
95801 GACCGCCCCA CGTCCACCAG TTCGGGGAAG ATCTGGGCAT CGATCTGGTC 
95 851 GTCGTCCCGC TTGTGCAGGA CGTACCCCGG CGCACGGGTC TCGACGGGGT 
95901 TGCCCGCCGA ACCGGGCACC TTGAGGAACT TGCGGAGCTG GGAGATGTAC 
95951 ACATGCAGTC CCGCCGTGGC GCGCCGCGGC AGGTCCTCGC CCCAGATCTC 
96001 CCGCATCAGC TGCTCCAGGG AGACCACCCG GTCGGCGCGG ATG AGG AG CA 
96051 CGGTGAGGAC GATCTCCACC TTCTGGG CGT TGATGGTGGC G T AGTCGTTT 
96101 CCGTCCTTGA TGCGGAGCGG GCCCAGCATT TCGTATCTCA CCGAGCGTTC
96151 CCCCTTGCTG TCGCACGCTG CTGCGCACTG TCGGCCAGGG CCTTGGAGAT

962 01 GACTTCCGTG ACGCCCTGCT GGTGCGTGTT CAGATAGAAG TGGCCCCCGG 
96251 CGAAGACCTT GAGGTCGAAG GGGCCTTCCG TGTGCTGCTG CCAGGCCTCG 

963 01 ACCTCGTCCA GCGGCGCCTG CGGGTCCCGG TCGCCCACCA GGGCGGTGAT 

963 51 GGGGCAGGAC AGCGGCGGCG ACGGGTTCCA CCGGTAC AG C TCGACCGCCC 

964 01 GGTAGTCGTT GCGGACGACC GG G ATG ATC T CCGCGAGCAG TTCCTCGfbG 
96451 TCCAGGAACC GCGGGTCAGT GCCACCGGCC CGGCGCAGCT CGGCGGCCAA 

965 01 CTCGGTGTCG TCGAGGAGGT GTACGGTGCC GCGCCGGAAG CGGGACGGCG 
965 51 CGCGGCGTCC CGAGACGAAC AGCCGGCAGG GCTGCTTCCC CGTGCGCTCG 
96601 CGGAGCCGCT GGGCGACTTC GTAGGCGAGG ACGGCGCCCA TGCTGTGGCC 
96651 GAAGAACGCC AACGGGCGGT CGTCGAACGG GCCGAGCGCA TCGGTGATGA 

967 01 GGTCGGCGAG TTCCCCGATG TCGTCCAGGA GCCGCTCTCT GCGGCGGTCC 
96751 TGTCGCCCGG GGTACTGCAC CGCGAGGACC TCGCTGTCGG TCGGGAGAGT 

968 01 GGGGGATTGC GCAAGGGGGT GGTAGTAGGA GGCCGAGCCG CCCGCGTGGG 
968 51 GG AAG C AG AC CAGGCGAACG ACGGCTTCCG GTCGGGGCCG GAAGCGACGT 
96901 ATCCAAGGGT CCGACATATC GGGTGGGGGG AAGGCAGACA AGATCTTTCC 
9 69S1 CTTCGCCAGG AACGCTGACA ACGGTGTGTC GCCACATCAC ATAGCCG CTC 
970 01 CTGATCATGC GCAGCTCAAA GTTTAAACGG CAACGTCGCT AACGGGGGAG 
970 51 CAGGGCGGAA T C AG AC ATTC CCCATCCTTT ATTCCGCGAT TCTTACGTGA 
97101 TCGAATCCCG GCGGCCAAGA TGGAGTAAAT TTCAATATGA ATGCTTAACG 
97151 CCGCACAGCT TGTACGGCGG GCCGCCCGGG CGGTGACTGG CGTCCCTGCC 
97201 AGCCGTGATG GCCTGACGAG GCCTC CGGG A TCCATCCCCC GCCCGCTGTC 

972 51 GCCGAGTTCT TTGCGGGATT ATTACGTTGC ATTGGTTTGC TTCGTGGCCC 

973 01 GGGCCGTTGG CCTGCGCTAT TTGGCAGCCT TCCGTCATGG GTGGTAAAAG 
97 351 ATCGCCTTTC CCCTCTGGGG TGCCGGTCGA GCTGGCCTCG ACCGCGATTG 

97401 TGGCTTGTTG TTTTCTTGTG GCGCCGCGTG TGAAACAGCG GCAGTTGGCC
97451 ACTCGCTCTG ACAGGCTCCG GGGACGGGGT TGTCACCTTT TGGGGTGACT
97501 GGCCTCGTTC AAGGCGTCCT GGCCCGTGGT GCATCCGCGA TCGTCGTGCC
97551 ATGGGTGAAG TGGGAAGGAG CACAGAACGA TGAGCGAGAG CATGGCGTGG
97601 CTGACGCGGG ACGTCCGCAA GGCCCGCAAG GAGGGCAGTG CGGGGACCGC
97651 GCGGCGCCGA GCCGACCGGC TGGCGGACCT GGTCGCCCAC GCCCGCTCGG
97701 CGTCGCCGTA CTACCGGGAG CTCTACCACG GCCTGCCCGA GCGGATCG^G
97751 GACCCGACGC TGCTGCCGGT GACGGACAAG AAGCAGCTGA TGGACCACTT
97801 CGACGACTGG CCGACGGACC GCGACATCAC CTTCGAGAAG GTCCGCGCGT
97851 TCACCGACGA CCCCGAGCTG ATCGGGCGGC GCTTCCTCGG CCGCTATCTG
97901 GTGGCCACCA CGTCGGGCAC CAGCGGCAGG CGCGGCCTGT TCGTGCTCGA
97951 CGACCGGTAC ATGAACGTGT CCTCCGCCGT CTCCTCCCGG GTGCTCGCCT
98001 CCTGGCTCGG CCCCCTCGGC ATCGCCCGGG CCGTCGTCCA CGGCGGCCGC
98051 TTCGCCCAAC TCGTCGCCAC CGAGGGACAT TACGTCGGCT TCGCCGGATA
98101 CTCCCGCCTG CGCCAGGACG GCGGAGCGCG CAGCAAGCTC GTCCGCGCCT
98151 TCTCTGTGCA CGAGCCGATG TCACGTCTGG TCGCCGAACT CAACGAGTAC
98201 CGGCCCGCGT TCGTCATCGG CTACGCCAGT ACGATCATGC TGCTCACCGC
98251 CGAACAGGAA GCGGGCCGGC TGCACATCGA CCCGGTGCTG GTCGAGCCCG
98301 CGGGCGAGAC GATGACCGAG AGCGACACCG ACCGCATCGC TGCGGCGTTC
98351 GGCGCCAAGG TGCGCACGAT GTACAGCGCG ACCGAGTGCA CCTACCTCAG
98401 CCACGGCTGC GCCGAGGGCT GGTACCACGT CAACGACGAC TGGGCCGTGC
98451 TCGAACCGGT CGACGCCGAC CACCGGCCCA CCCCGCCGGG GGAGTTCTCG
98501 CACACCACCC TGATCAGCAA CCTCGCCAAC CGCGTCCAGC CGTTCCTCCG
98551 CTACGACCTG GGCGACAGCG TCATGCTCCG CCCCGACCCC TGCCCCTGCG
98601 GCACCCCCTC GCCCGCGATC CGGGTCCAGG GCAGGTCGGG CGACATCCTC
98651 ACCTTCCCCT CGGGCCGGGG CGACGACGTC AGCCTCGCCC CGCTCGCCTT
98701 CAGCAGCCTC TTCGACCGCA TGCCCGGAGT CGAGCTCTTC CAGATCGAGC
98751 AGACCGCGCC GTCGACCCTG CGCGTCCGCG TGGTCCAGGC GCCCGGCGCC
98801 GACGCGGACC ACGTGTGGCA GCGGGCCCAC GACGGGCTGA CCCACGTCCT
98851 CGCCGACAAC AAGCTCGACA ACGTAACCGT CGAACGGGGC GAGGAGCCGC
98901 CGCGGCAGGC ATCCGGCGGC AAGTACCGGA CGATCATCCC GCTCGCCGCC
98951 TGAACGCTCG CCGACTAGCC GCGCGCCGCC TGAGCTGCTC TCACCGCGCG
99001 TACGGGCGCA GCGGAGGCTC CTCGTCGACC CACGGCTGGC TGTGGATCAG
99051 CAGCTCGATC GGGAAGTTCA GCAGGCCGGG CAGGGCGTCG ACGGCCTCCT
99101 GGCTGTTGAG CGGCATGACC GGCTTGGCGC AGTGCGCGCG GTCGATGCGG
99151 CTCGTGTCGG CGGGACCGGG GTGCTCGATC GCATCGGCGA CCAGGTCGTA
99201 GCTGACCTGG TCGACGACCA TGGCGATGTG GGTCGGCCAC GGCCGACCCG
99251 GACAGATGTC CTGGACGCCG ATGCGGTGGG CGCCCGGCAG CGACGGCGCC
99301 TCCCCGTCGG CCACCACGGA CTCGTCGGCG TATGAATAGA TCGTGGTGTA
99351 CGACGGTCCT GCGGGCATGG GCGTGCCGTC GGCGCCCAGA GCCTTCGACC
99401 AGTTCGAGTC GCGGGCGAAC TGCAGGACCG ACGCCGGGCA GCCCGCCACC
99451 TCGGCGATCG GGCGGCAGGG CGAGGCCAGC CGGGTCCCCT GGAACGGGGA
99501 GCCCAGGGTC ACCATGTCGT CGACCTTCCC CGGCAGGTCC GGCCAGAAGC
99551 GCAGGGCCCA CGCCGTGAGG AGGCCGCCCT GGCTGTGCCC GACGAGATCG
99601 ACCTTCCGGC CGGTGGCCTC CTGGATCGCG CGGGTCGCGT ACACCACGTA
99651 CTCGACGGAC TCCTGCATGT CACGGAGCCC GCGACCGGGA GAATCCACCC
99701 AACAGGACTG GTAGCCCTTC TTCTTCAACT CGGCCATGTA GTTCCAGGCG
99751 TAGTTCTCCT CGCCCTTGAG GCCGGTCCCG GGCACGAAGA GGACGGTCGG
99801 CTTGTCACCG GCGTCACGCA GGTCCCCCAG CTCCGTCCCG CAGTGCAGCG
99851 CCTTGGCGAG CTCGGCCGCC GGTATCTCCA ACGGGGGAGA GGAAACATCC
99901 GCCGCCGAAG CGGCGGAGGC CGGAAGCACG GTGGCGGCCA GCACGGCCGC
99951 CACGAGTCCG CCGAGCCATG AGGACAAGCG CACGGTGACC TCCACAGGAA
100001 CCTTCACGAG TGAGCGGAAA CTCCCTCCGG AGGGAGCACC TCATCGTGCG
100051 GCGGCGCCAC AGTAGCCGTC AACTGCCCCA CGGGGCTGAG TAGTTGACAG
100101 TTGGCCGGGC TCGGCCGGCG AAGCGCCCGG GCCCCGCCGC CCCGCGCCGT
100151 GCGCGAGGGG TCCGTGACCT GGGTGGACGG TCCGGTTGGA CATCCCGGGG
100201 GAGCCTCTGG CATGGTCGCC CGTCCGTCCC CCTCAAGAAC CGAAGGGAGC
100251 GTCACGATCA CGATGATCGA AGTCAGCACG CGCAGCATGA AGGAAGCGGC
100301 TGCCGCCGAG CAGCTCCGCG CGGAGACCAC GACACTGGAC ATTCCAAASJG
100351 GTTTCGACCT GTGGACGGCC GACGAGATCG CGGAGTGGCT CGACGGCGTC
100401 GAGGACGACC CGGCAGTCTC CGACGCCGAC TTCTACGCGG CCCAGCAGCG
100451 GTGCGACGGG TCCTCGGCAC CGAGGGCACC TGACCCGCCG GCGGCCCTGC
100501 GCGGCCCTAC GTGTGCAGCG CCCCGTCCTC CTCCACATGC CCCTCCGGCT
100551 CCAGCTGGAT CGTCGAGTGG GCCACGTCGA AGTGGCCCCC GACACACCGC
100601 TGAAGGCGCC CCAGGAGCTC CCCGTACCCG CTCGCGAGAG CCTCCTCCGT
100651 GACCACCACG TGCGCGGTGA GCACCGGCAT CCCCGAGGTG ACCGTCCAGC
100701 CGTGCAGATC GTGCACGGCG ACCACGCCCC GCTCCTCCAG CAGGTGCCGG
100751 CGCACCTCGC CGAGGTCGAC GTCCTGCGGG GTCGCCTCCA GCAGGACGTG
100801 CAAGGAGTCC CGCAGCAGGC CGTACGCGCG CGGCACGATC AGCAGGCCGA
100851 TGACGATCGA CGCGATCGGG TCGGCGGCCT GCCACCCCGT GAGCAGGATG
100901 ACCAGGCCGC CCACGATCAC CGCGACCGAG CCGAGCGCGT CGCCCAGCAC
100951 CTCCAGGTAC GCGCCGCGCA GATTGAGGCT CTTCTCCTTG GCGTCCCGCA
101001 GCAGCCACAG GCCCACCAGG TTGGCGGCGA GCGCGCCCAG CGCGACCACG
101051 AACATCAGGC CGCCCTTCAC CTCCACCGGC TCGCTGAACC GGCCGATCGC
101101 CGACCACAGG ACCCAGGCGA AGATGACGAC CAGGAGCAGC GCGTTCAGGA
101151 CCGCGGAGAA GATCTCCACG CGGTAGAAGC CAAAGGTGCG CCGCGGCGTC
101201 GGCGCCCGCT GGGCGAGGGT GATGGCACCG AGGGCCAGCG AGACGCCGAC
101251 CGCGTCGGTC AGGCTGTGCG CGGCGTCGGC GAGCAGCGCG AGGCTGCCGG
101301 ACAGGAGCGC GCCGACCACC TGGATGACGG TGATCGAGCC GCTGATGCCG

101351 ATGGTCCACA GCAGGCGCTT GCGGTACGTG CCGCTGAGAG TGCCGCCCGC
101401 CGCCCCGGCG GACGGACCGT GGTCGTGCCC CATGCCCGCG AGTGGACCAC
101451 GGCGGCGCGG CACCCGCCAC CGAGCGGCCG CCGGTCGGCT CAGTGCAGCC
101501 GGGCCTGGGT GGAGGTGTCG CGCTGGTGCG GGATGCCGAG CGGCGGCGGC
101551 AGCTCGCCCT GCTGCACCCT GACCGTGCGC ACGGGGGCGG GGACCCGGAT
101601 GCCCTCGGCG CGGTAACGCT GGTGCAGGCG CTTGATGAAC TCGTGCTTGA
101651 TGCGGTACTG GTCGCTGAAC TCGCCGACGC CGAGGATCAC CGTGAAGCTG
101701 ATCCGCGAGT CGCCGAAGGT GTGGAAGCGG ATCGCCGCCT CGTGGTCGGG
101751 GACCGCGCCG GTGATCTCGG CCATCACCTC GTCCACCACC TCGGTCGTGA
101801 CCTTCTCGAC CTGCTCCAGG TCGCTGTCGT AGCTGACCCC GACCTGCACC
101851 ATGATCGACA GCTCCTGCTC GGGGCGGCTG TAGTTGGTCA TGTTGGTGCC
101901 GGCGAGCTTC GCGTTGGGGA TGATGACGAG GTTGTTGGAG AGCTGGCGGA
101951 CCGTGGTGTT GCGCCAGTTG ATGTCGACGA CGTAGCCCTC CTCCCCGCTG
102001 CTGAGCTGGA TGTAGTCGCC GGGCTGCACG GTCTTCGCGG CGAGGATGTG
102051 CACGCCCGCG AAGAGATTGG CGAGCGTGTC CTGCAGTGCG AGGGCGACCG
102101 CGAGACCTCC CACGCCGAGG GCGGTGAGCA GCGGTGCGAT GGAGATGCCG
102151 AGGGTCTGAA GGACGATGAG GAAGCCCATC GCGAGCACCA CGACGCGGGT
102201 GATGTTCACG AAGATGGTGG CCGATCCGGC CACTCCGGAG CGGGACTGTG
102251 CCACGGCCTT CACCAGGCCG GTGACGATCC GGGCCGCCGT GAGCGTGGCG
102301 GCCAGGATGA GCAGCGCGGT CAGCGTCATG GTGACGTTGC GTCCGGTGCG
102351 CGGCGTGAGC GGCAGCGCGC CCGCCGCGGC GGCGAGCCCG GCGGTGATGG
102401 CCGCGCAGGG CACGAGGGTG CGCAGGGCGT CGACGATGAC GTCGTCACCG
102451 CTCCACCGGG TTTTGCTCGC CCGTTCGCCG AGCCACCTCA GAAGTGCGCG
102501 GAGCAGCAGC CCGGCGACGA CGCCGGCGAC GACCGCGATA CCGGCCACGA
102551 TCCAGTCGTG CAGTGTGAGG GCACGGGTCA TCAGTTCGCT CCCGTCGTAC
102601 GGGGGGAGTG CGCCTGTGTG GGGCGTATGT GATGTGACGT CACCTTGTGA
102651 TACCTGCTCG ATTCCGGGGA GTGCGGTCAC GCCGGGACGA GAGCTCGGTT
102701 CCGGCGCGGA CGTCATCCTG CCCCATCCGC CCACGGCAGG CGTGCATACC
102751 CCCACCTGGA TCTTCACAGA CCGGCCACGT CTGTCCATGC GCCGATGAGC
102801 GCGCTGCCCG TGGTAAAGCA TTGAGTCAGG CGATTTGGCC ACTCGGCACT
102851 CGGCGGACCG GTCGAGCCGG TCGATCTACG TGAGCGGAGG CGGTTGAGCA
102901 TGGCGTCCAT GTGCAGACCC GGAATGTCAC CCGTCAATTC GCACAACCAG
102951 TGGGATCCGC TGGAGGAGAT CATCGTCGGG CGGCTGGAGG GCGCGACCAT
103001 TCCCTCCAGC CATCCGGTCG tSgCGTGCAA CATCCCGACC TGGGCGGCAC
103051 GGCTGCAGGG TCTCGCCGCC GGGTTCGAGT ATCCGCAGCG GCTGATCGAG
103101 CCGGCGCAGC AGGAGCTCGA CCAGTTCATC GCTCTCCTGC AATCCCTCGA
103151 CGTCACAGTG AGACGGCCGG CGGCCGTCGA CCACAAGCAC CGCTTCGGGA
103201 CCCCCGACTG GCAGTCGCGC GGCTTCTGCA ATTCCTGTCC GCGGGACAGC
103251 ATGCTCGTCG TCGGCGACGA GATCATCGAG ACCCCGATGG CGTGGCCGTG
103301 CCGCTGTTTC GAGACGCACT CGTACCGCGA ACTCCTCAAG GACTACTTCC
103351 GGCGCGGCGC GCGCTGGACG GCGGCGCCGC GCCCCCAGCT CACCGAGGCC
103401 CTGTACGAGA AGGACTTCCG CCCTCCCGAG GAGGGCGAAC GATGCGCTAC
103451 ATCCTCACCG AGTTCGAGCC GGTGTTCGAC GCGGCGGATT TCGTGCGGGC
103501 GGGCCGCGAC CTGTTCGTGA CGCGGAGCAA CGTCGGCAAC CTGCTGGGCA
103551 TCGAGTGGCT GCGCCGCCAC CTTCGGGCCG GAGTACCGCG TGCCACGAGA

DECLARATION, POWER OF ATTORNEY AND POWER TO INSPECT



As a below named inventor, I hereby declare:

that my residence, post office address and citizenship are as stated below 
next to my name. 

that I verily believe I am the original, first and sole inventor (if only one
name is listed below) or an original, first and joint inventor (if plural inventors are
named below) of the invention entitled: POLYKETIDES AND THEIR SYNTHESIS the specification
of which [check one(s) applicable]

_X_ was filed 30 May 2000 as International Patent Application No.
PCT/GB00/02072, on which U.S. National Stage Application No. 0 9/9 SO, 21 1
is based; and/or

___ was amended by Amendment filed ___ (if applicable); and/or

___ is attached to this Declaration, Power of Attorney and Power to Inspect;

that I have reviewed and understand the contents of the above-identified
specification, including the claims, as amended by any amendment referred to above; and

that I acknowledge my duty to disclose information which is material to the
examination of this application in accordance with Rule 56(a) [37 C.F.R. §1.56(a)].

CLAIM UNDER 35 U.S.C. §119: I hereby claim foreign priority benefits under 35
U.S.C. §119 of any foreign application(s) for patent or inventor's certificate listed
below and have also identified below any foreign application for patent or inventor's
certificate having a filing date before that of the application of which priority is
claimed:

Prior Foreign Application(s)

Appln. No.     Country          Filing Date (Day-Mon-Year)     Priority Claimed (Yes - No)

9912563.5      Great Britain    28-05-1999                     Yes



POWER OF ATTORNEY: As inventor, I hereby appoint DANN, DORFMAN, HERRELL AND
SKILLMAN, P.C. of Philadelphia, Pennsylvania, and the following individual(s) as my
attorneys or agents with full power of substitution to prosecute this application and to
transact all business in the United States Patent and Trademark Office connected
therewith: Patrick J. Hagan, Reg. No. 27,643 and Kathleen D. Rigaut, Ph.D., Reg. No. 43,047.

POWER TO INSPECT: I hereby give DANN, DORFMAN, HERRELL AND SKILLMAN, P.C. of
Philadelphia, Pennsylvania or its duly accredited representatives power to inspect and
obtain copies of the papers on file relating to this application.

SEND CORRESPONDENCE TO: CUSTOMER NUMBER 000110

DIRECT INQUIRIES TO: Telephone: (215) 563-4100

Facsimile: (215) 563-4044

I hereby declare that all statements made herein of my own knowledge are true and
that all statements made on information and belief are believed to be true; and further
that these statements were made with the knowledge that willful false statements and the
like so made are punishable by fine or imprisonment, or both, under Section 1001 of Title
18 of the United States Code and that such willful false statements may jeopardize the
validity of the application or any patent issued thereon.



SOLE OR FIRST JOINT INVENTOR                     SECOND JOINT INVENTOR (IF ANY)

Signature: [illegible]   Date: [illegible]       Signature:               Date:

Residence: [illegible]                           Residence: [illegible]
City, State or Country                           City, State or Country

Post Office Address:                             Post Office Address:
Street Address                                   Street Address
City, State or Country, Zip Code                 City, State or Country, Zip Code



THIRD JOINT INVENTOR (IF ANY)

Full Name (First, Middle):
Signature:                Date:
Residence (City, State or Country):
Citizenship:
Post Office Address:
Street Address
City, State or Country, Zip Code

FOURTH JOINT INVENTOR (IF ANY)

Full Name (First, Middle):
Signature:                Date:
Residence (City, State or Country):
Citizenship:
Post Office Address:
Street Address
City, State or Country, Zip Code



